In our Culture and Methodologies category, dive into Agile, career development, team management, and methodologies such as Waterfall, Lean, and Kanban. Whether you're looking for tips on how to integrate Scrum theory into your team's Agile practices or you need help prepping for your next interview, our resources can help set you up for success.
The Agile methodology is a project management approach that breaks larger projects into several phases. It is a process of planning, executing, and evaluating with stakeholders. Our resources provide information on processes and tools, documentation, customer collaboration, and adjustments to make when planning meetings.
There are several paths to starting a career in software development, including the more non-traditional routes that are now more accessible than ever. Whether you're interested in front-end, back-end, or full-stack development, we offer more than 10,000 resources that can help you grow your current career or develop a new one.
Agile, Waterfall, and Lean are just a few of the project-centric methodologies for software development that you'll find in this Zone. Whether your team is focused on goals like achieving greater speed, having well-defined project scopes, or using fewer resources, the approach you adopt will offer clear guidelines to help structure your team's work. In this Zone, you'll find resources on user stories, implementation examples, and more to help you decide which methodology is the best fit and apply it in your development practices.
Development team management involves a combination of technical leadership, project management, and the ability to grow and nurture a team. These skills have never been more important, especially with the rise of remote work both across industries and around the world. The ability to delegate decision-making is key to team engagement. Review our inventory of tutorials, interviews, and first-hand accounts of improving the team dynamic.
Kubernetes in the Enterprise
In 2022, Kubernetes became a central component for containerized applications, and it is nowhere near its peak. In fact, based on our research, 94 percent of survey respondents believe that Kubernetes will be a bigger part of their system design over the next two to three years. With Kubernetes becoming more entrenched in systems, what do adoption and deployment methods look like compared to previous years? DZone's Kubernetes in the Enterprise Trend Report provides insights into how developers are leveraging Kubernetes in their organizations. It focuses on the evolution of Kubernetes beyond container orchestration, advancements in Kubernetes observability, Kubernetes in AI and ML, and more. Our goal for this Trend Report is to inspire developers to leverage Kubernetes in their own organizations.
When you're navigating new territory, it is essential to have a guide. And if you want to grow your skills as rapidly as possible, it helps to learn from someone who has done it before.

Why Use an Engineering Advisor?

An advisor is someone who helps you produce better results with your work. They:

- Suggest new ways to solve problems.
- Help you navigate problems you haven't dealt with before.
- Point out mistakes you might be making.
- Accelerate your career growth.
- Provide new mental models for thinking about your work, giving you more tools for your leadership toolkit.

How Do Executive Advisors Work?

Every advisor is different, so you should find out what approach a prospective advisor uses. I think my approach is fairly common, so let me share how it works in my practice. With most of my clients, we conduct working sessions: one-hour sessions held weekly, biweekly, or monthly. In these sessions, my client brings their current problems, and we problem-solve them together. Each of us brings an essential part to the working session. I bring a lot of experience (both in years doing this and across many companies); I've seen it all before. My clients have their own experience and expertise, but most importantly, they have much better context on the problem and environment. We talk through a few ways to solve each problem, and we review the tradeoffs of those approaches. I usually suggest solutions my clients wouldn't consider, and I share tradeoffs they would miss. They, in turn, point out ways the environment may bias the solution in a particular direction. Some of my clients use me to test their plans. They'll share what they're thinking and have me walk through the tradeoffs or suggest alternatives they might not have considered. We walk out of these sessions with solid plans and new ways of addressing problems. It's like pair programming, but for leadership!
Other advisors may have different styles or ways of operating, so be sure to inquire into their approach.

When To Use an Advisor

I see several profiles of leaders who reach out for advisory help:

- Leaders who are in new territory: Typically, leaders approach me when they're getting in a little over their heads, or their current approaches seem to break down (this often happens at ~20 or ~50 engineers in an organization). Symptoms include too many direct reports, too much reliance on the leader personally, engineering velocity slowing down, or quality problems emerging.
- Leaders who are promising but need support: Very talented leaders with slightly fewer years of experience often seek out advisors to make sure they're well supported. Sometimes their boss may even insist on it. This can help bright and capable leaders step into situations that may otherwise feel risky for the company. (Hint: you can use this to your advantage to get companies to take a risk on you by pairing the responsibility with an advisor.)
- Leaders who are growth-minded and want to be as effective as possible: These leaders know that their growth is magnified when they get good feedback on their work, so they accelerate it by choosing an advisor who will give them new perspectives.
- Leaders whose work culture encourages growing their leadership: Some companies invest in their leaders because they view it as a high-leverage place to invest. Making a leader 10% better has a greater than 10% impact on their area of the company. In these companies, most of the senior leaders have executive advisors or coaches.

How To Choose an Advisor

I recommend you interview a couple of advisors to determine which person is best for you. Ask people in your network for recommendations, select a few, and contact them. With my potential clients, I like to do a practice session. This gives the potential client a feel for what it's like to work together.
Ideally, they will do the same practice session with a couple of advisors and see what kind of insight they get from each person. They should also evaluate how comfortable they feel being vulnerable or showing their weak areas with the advisor. Keep in mind that sessions will get better over time as you develop a working relationship, but a sample session will give you an idea of what it's like to work with the person, so I recommend asking for one.

What To Look For

What should you look for in an advisor?

- Someone who has been through the stages you're going through
- Someone you communicate well with
- Someone you can be vulnerable with
- Someone who listens as much as they speak
- Someone who doesn't have the same solution to every situation

The last point is worth emphasizing. How flexible is the advisor's thinking? Do they respond to every situation with the same solution? That's often a red flag.

Rates for Advisors

Rates vary greatly depending on the advisor; I've seen rates that vary by an order of magnitude! If you're on a budget, you can probably find someone fairly inexpensive, but you have to be careful with your choices. You're hiring someone for their expertise, not just for them to coach you (more on the difference between coaches and advisors later). Ideally, your company should be paying for the advisor. The heuristic I give companies for deciding whether an advisor's rates are reasonable: take the yearly cost of the executive advisor and compare it, in both anticipated impact and cost, to hiring an additional engineer. When you consider the impact on the company, it's generally a very smart investment, even if it looks quite expensive. A good advisor can help you avoid expensive mistakes and improve the trajectory of an entire department. That's often worth it, even at a high price.
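The cost heuristic above can be sketched as back-of-the-envelope arithmetic. Every figure below is a made-up assumption for illustration, not market data, and the overhead factor is a common rough rule of thumb rather than anything from this article:

```python
# Illustrative comparison of advisor cost vs. one additional hire.
# All rates, salaries, and factors here are hypothetical assumptions.

def yearly_advisor_cost(monthly_rate: float) -> float:
    """Yearly cost of an advisor billed monthly."""
    return monthly_rate * 12

def yearly_engineer_cost(base_salary: float, overhead_factor: float = 1.3) -> float:
    """Fully loaded cost of one engineer (salary plus benefits/overhead)."""
    return base_salary * overhead_factor

advisor = yearly_advisor_cost(monthly_rate=3_000)      # hypothetical rate
engineer = yearly_engineer_cost(base_salary=180_000)   # hypothetical salary

# Even a pricey advisor is typically a fraction of one additional hire.
print(f"Advisor: ${advisor:,.0f}/yr vs engineer: ${engineer:,.0f}/yr "
      f"({advisor / engineer:.0%} of one engineer)")
```

If the advisor's anticipated impact on the department exceeds that fraction of an engineer's output, the heuristic says the rate is reasonable.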
Advisors can be expensive and can even feel out of reach for individuals. Why? Consider the economics of the situation. A highly experienced engineering leader can command a salary of $300-500k/year, plus 0.5-2% of a company's equity, and bonuses; total comp can be $350k-$1MM a year. Advisors also carry a lot of overhead just to operate. To give you an idea from my own consulting: I have four or five hours of meetings with prospective clients for every client that signs up (and I think that's a good ratio). I need about four hours of client meetings a month just to be sustainable as a business, and it fluctuates greatly. I also have to manage taxes and health insurance and deal with unpaid vacation. When you add all these things up, the rates can end up being higher than you might expect. I personally provide a 50% discount for individuals, because otherwise many of them couldn't afford it, and many find even the discounted rate out of reach. If possible, go through your company. An easy way to broach this is to ask whether they offer executive coaches or executive advisors; if they don't, ask if they'd be open to it. Many leaders don't want to seem "weak" by asking for this, but many companies do offer this kind of investment in their employees, or are willing to. You may be surprised to find that the person you're asking already uses an advisor! Most advisors charge either hourly or by the number of meetings; my own rates are monthly charges based on the frequency we plan to meet, which seems to be pretty common.

What About Using an Executive Coach?

Here's what ChatGPT says is the difference between using a coach and an advisor. I have to admit it's a good summary of the difference: "Both advisors and coaches can provide guidance and support, but there are some key differences between the two roles. An advisor typically has specialized knowledge or expertise in a particular area and offers advice based on that expertise.
They may provide recommendations, opinions, or solutions based on their experience and knowledge to help a person make informed decisions. On the other hand, a coach is more focused on helping a person achieve their goals and develop their skills or abilities. Coaches typically ask questions and use various techniques to help their clients identify their own strengths and weaknesses and work towards self-improvement. Coaches may also help their clients create action plans and hold them accountable for their progress. Another key difference is that advisors tend to have a more directive approach, while coaches are more non-directive. Advisors often offer specific recommendations and may even make decisions for their clients, while coaches help their clients find their own solutions and strategies. Overall, both advisors and coaches can be valuable resources for people seeking guidance and support, but they have different areas of focus and approaches." You'll find that there is a continuum between the two: most advisors also do some coaching, and some coaches also do some advising. My advice for engineering leaders is to find someone who leans towards advising, because deep experience in engineering leadership is worth a lot; there is a wide body of knowledge and expertise you can tap into from advisors who understand that domain well. But this does depend on your goals. If you're looking to become a more effective communicator or to gain more perspective as a leader, a coaching approach may be more appropriate. Both are reasonable options. I would look at this as a spectrum you can use to evaluate the people you're considering.
As software engineers, we mainly talk about the power of tech skills and spending time learning new techniques. However, influence also impacts your career.

Why Should I Care About Influence?

Usually, we focus only on hard skills when discussing the software career path. That is a colossal mistake that several engineers, including me, have made for a long time. In this post, I'll explain why influence is vital to your career. The first step is to make sure we're on the same page: the point here is not to undervalue the tech perspective. Any engineer must have experience with software engineering, architecture, code design, and so on. The point is not to focus only on those aspects and forget the others. Communication comes up most often when we discuss the post-senior career path, such as the Staff-Plus Engineer track. The Staff Engineer's Path by Tanya Reilly explains how to get on the culture/organization map, especially its tricky terrain, and covers the relationship between the engineering and product teams. Reilly's book is excellent on the importance of influence, but it is not unique. There is also the Engineers Survival Guide: Advice, Tactics, and Tricks, which has a chapter called "Influence Is Everything"; it was one of the first books I read about branding for a software engineer. Being visible inside and outside the organization is crucial for your career. I hope I have your attention on how vital visibility is for any engineering career. Let's talk about how to raise your visibility inside and outside the organization, and then start applying it.

Increasing Your Visibility

When we discuss visibility, it is vital to highlight that you need to be visible both inside and outside your organization. When your organization knows what you know and what you're doing, it becomes easier to talk about your internal progress and promotion.
However, your career belongs to you, not your company. With that in mind, find ways to work on your branding outside the organization. Sadly, some companies do not value you, and you need to be prepared to move; when you have visibility around what you're doing, it becomes easier to land a new job. In this post, we'll enumerate four options:

- Explore the company blog: Your company blog is an excellent place to share your status, spread good practices, and help your colleagues in a single place. Team scalability and raising the quality bar are other points you can address there.
- Be kind: Everyone enjoys an accessible and helpful person, and an organization is no exception. Be prepared to help other colleagues with your knowledge, and be available to help. You're not scalable; thus, define a strategy to impact as many people as possible through internal workshops, articles, and mentoring other engineers.
- Conferences: A conference is a place not only to share your knowledge and expertise but also to learn and grow your network. There are several formats, both in-person and online. Explore both: human contact is essential, and a new contact might become a future colleague, while online events save time. Use the time to get more people into your circle.
- Open source: You cannot talk about the ultimate stage of sophistication in software engineering without talking about open source. Being part of big open-source projects, such as those within the Eclipse and Apache Foundations, can teach you the most advantageous techniques to enhance your engineering team. From open source, you can learn how several companies achieve high-quality code, comprehensive documentation, strong collaboration, and a healthy culture. Besides this knowledge, open source will help your visibility.
This is a differentiator; e.g., far fewer people contribute to the JVM or the Java language than work with Java.

Conclusion

As software engineers, we usually talk only about hard skills; however, influence matters in your career. Getting visibility inside and outside the organization is crucial. It is time to discuss your branding and credibility more. There are several ways to increase this credibility, such as company posts, personal posts, conferences, open source, and more. I hope I helped you see this perspective and move your career to the next level. Please let me know your thoughts, whether you're investing your time in being visible, and which methods you're using.
Given the high demand for varied skill sets, whether for upgrading existing systems or for new projects, assembling the best team for a project is very challenging. A few proven practices can improve the situation; I have listed some of them below and hope they are helpful.

Self-Awareness

The most important aspect of learning is self-awareness. As a leader and as an individual, the team must know its strengths and areas of improvement. Learning begins only when you are aware of your weaknesses. Once you know there are weaknesses, the situation will never allow you to sit on them; instead, it will continuously push you to take steps to overcome them.

Retrospective

Once you are aware of strengths and weaknesses, you can run a retrospective and find out more about the areas of improvement. The retrospective is a very important process covering yourself, the team, and the existing process, and learning from it is a foundational requirement for success. It is important to have data on the strengths and weaknesses of your team so that you can define the best strategy and lead the team to success.

Product Roadmap and Industry Trends

As an individual and as a team, you could be in a perfect position for the current requirements of the industry or customer needs. However, the skill gap analysis must also consider future industry trends and customer needs. For example, there is a buzz about AI/ML, whether ChatGPT or other similar trends; assessing yourself in such contexts will give you a better sense of the skill gap. You and your team may be doing great with an on-premises deployment architecture while the industry trend moves toward cloud development; questioning your team's cloud skill set will reveal that gap.

Training Approach

1. Data Collection and Identifying the Gaps

Identify the current skill set of your team through surveys and team meetings, and capture the data as explained below. Keeping yourself, a team member, or a process in the center, analyze and list strengths and weaknesses. The process can be as follows: prepare the roster (people or processes) and record each team member's status (Beginner, Developing, Mastered) for each skill area:

- Programming skills
- Project management skills
- Soft skills
- Design and architecture

While preparing the list of skills to measure, consider two aspects: current requirements and future requirements. Update the status for all team members. Once the skill-level data is captured for everyone, it is important to sort it into buckets such as current demand and future demand.

2. Define the Learning Paths

Based on the categorization of the skill gaps, you can define the learning paths. Using your L&D tools, you can assign these learning paths to your team members to complete. To make implementation effective, rewards and recognition are important tools.

3. Learning Partnership

In case you are not equipped with content, identify a partner who can provide the content required for your skill gaps.

4. Learning Path Types

Upskill

The upskill learning path is mainly focused on learning new skill sets within the same domain of expertise. For example, developers could be learning the languages and tools required for new requirements. Training is an important aspect of team development. We must have a predefined set of training for team members' growth based on the skill gaps, and we should run it continuously to be prepared for future requirements. Tools and technologies are changing rapidly, and we need to keep team members ready for the new skills required by the customer.
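The roster capture and demand bucketing described in the data-collection step can be modeled with a small data structure. This is a minimal sketch; the member names, skills, and demand tags below are illustrative assumptions, not anything prescribed by the article:

```python
from collections import defaultdict

# Hypothetical skill matrix: member -> skill -> status
# (status is one of "Beginner", "Developing", "Mastered").
matrix = {
    "Alice": {"Programming": "Mastered", "Cloud": "Beginner"},
    "Bob":   {"Programming": "Developing", "Cloud": "Developing"},
}

# Each skill tagged by demand horizon: current vs. future requirements.
demand = {"Programming": "current", "Cloud": "future"}

def skill_gaps(matrix, demand):
    """Bucket every (member, skill) pair that isn't mastered by the skill's demand."""
    gaps = defaultdict(list)
    for member, skills in matrix.items():
        for skill, status in skills.items():
            if status != "Mastered":
                gaps[demand[skill]].append((member, skill))
    return dict(gaps)

print(skill_gaps(matrix, demand))
```

The "current" bucket feeds near-term learning paths, while the "future" bucket drives the roadmap-oriented upskilling discussed above.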
Training can be delivered in two ways: internal and external. Every team usually has a few people who are enthusiastic trainers by nature; we should identify and encourage them. They are generally very passionate about training, which helps both the organization and the efficacy of the training itself. Internal training also helps team members develop leadership and communication skills. To make the best use of external training, it is important to ground the team in the foundations first so that the training can be utilized effectively. As part of internal training, we should encourage learning by motivating team members to learn and share concepts; this builds strong bonds among team members. Once team members have understood the concepts and basics through internal training, we can hire external trainers or specialists to teach advanced or specialized topics. This approach has two advantages: team members arrive with the basics and can absorb advanced topics easily, and the external training is less expensive while diving deeper.

Reskill

The reskill learning path is mainly focused on training your team members in a new domain of expertise. Leaders and managers must try to identify potential team members who are ready for the next position. These people will be doing a fantastic job and achieving mastery in their current role; we must identify them and reskill them by providing the training required for the next role. It will help them understand the next role more effectively and be successful in it. Many times, people who are successful in their current role fail in the next role, mainly because the requirements of the next role are not the same as those of the current one. We must support them by providing the necessary training for the next role as well. Keeping the skill matrix up to date and upskilling and reskilling the workforce will help embark on new projects with different skill set requirements.
It will also help in building next-level leadership for the organization.
When it comes to online services, uptime is crucial, but it's not the only thing to consider. Imagine running an online store: having your site available 99.9% of the time might sound good, but what if that 0.1% of downtime happens during the holiday shopping season? That could mean losing out on big sales. And what if most of your customers are only interested in a few popular items? If those pages aren't available, it doesn't matter that the rest of your site is working fine. Sometimes, being available during peak moments can make or break your business. It's not just e-commerce: a small fraction of airports handle most of the air traffic, just a tiny minority of celebrities are household names, and only a handful of blockbuster movies dominate the box office each year. It's the same distribution pattern everywhere. To be successful, it's important not only to maintain uptime but also to be ready for significant events. Some teams implement change freezes before key times, such as Prime Day, Black Friday, or Cyber Monday. This approach is reasonable, but it can be limiting, as it doesn't allow teams to respond quickly to unexpected opportunities or critical situations. Additionally, not all demand can be predicted, and it's not always clear when those high-impact events will happen. This is where "Reliability when it matters" comes in. We need to be able to adapt and respond quickly to changes in customer demand without being held back by code freeze periods, while staying prepared for unforeseen situations. By treating time as a valuable resource and understanding the relative significance of different moments, organizations can better reflect customer value and adjust risk and availability budgets accordingly. This approach allows organizations to be flexible and responsive to changes in demand without missing out on crucial features or opportunities. In the end, it's about being ready when luck comes your way.
It's important to note that a system is not static; it is constantly changing. The system itself, the infrastructure it's hosted on, and the engineering organization all change over time. This means that knowledge about the system also changes, which can impact reliability. Besides that, incidents and outages are inevitable, no matter how much we try to prevent them. Bugs will be shipped, bad configurations will be deployed, and human error will occur. There can also be interdependencies that amplify outages. An incident rarely has a single cause; it is often a combination of factors coming together. The same goes for solutions, which are most effective when they involve a combination of principles and practices working together to mitigate the impact of outages. Operating a system often means dealing with real-world pressures, such as time, market, and management demands to deliver faster. This can lead to shortcuts being taken and potentially compromise the reliability of the system. Growth and expansion of the user base and organization can also bring additional complexity and result in unintended or unforeseen behaviors and failure modes. However, by adopting a holistic approach and utilizing the principles and practices of engineering I'm going to cover below, we can have the best of both worlds: speed and reliability. It's not an either-or scenario but rather a delicate balance between the two.

What Is Reliability?

Reliability is a vital component of any system, as it guarantees not only availability but also proper functioning. A system may be accessible, yet if it fails to operate accurately, it lacks reliability. The objective is to achieve both availability and precision within the system, which entails containing failures and minimizing their impact. However, not all failures carry equal weight. For instance, an issue preventing checkout and payment is far more crucial than a minor glitch in image loading.
It's important to focus on ensuring important functions work correctly during critical moments. In other words, we want to be available and functioning correctly during peak times, serving the most important functionality, whether that means popular pages or critical parts of the process. Making sure systems work well during busy times is tough, but it's important to approach it in a thoughtful and thorough way. This includes thinking about the technical, operational, and organizational aspects of the system. Key parts of this approach include:

- Designing systems that are resilient, fault-tolerant, and self-healing.
- Proactively testing systems under extreme conditions to identify potential weak spots and prevent regressions.
- Effective operational practices: defining hosting topology, auto-scaling, automating deployment/rollbacks, implementing change management, monitoring, and incident response protocols.
- Navigating the competing pressures of growth, market demands, and engineering quality.
- Cultivating a culture that values collaboration, knowledge sharing, open-mindedness, simplicity, and craftsmanship.

It also requires a focus on outcomes in order to avoid indecision and provide the best possible experience for customers. Below, we'll expand on the concept of "Reliability when it matters" and provide practical steps for organizations to ensure availability and functionality during critical moments. We'll discuss key elements such as designing systems for reliability and proactive testing and monitoring, and delve into practical steps like automating deployment and incident response protocols.

Reliability Metrics: A Vital Tool for Optimization

When optimizing a service or system, it's essential to first define your objectives and establish a method for monitoring progress. The metrics you choose should give you a comprehensive view of the system's reliability, be easy to understand and share, and highlight areas for improvement.
Here are some common reliability metrics:

- Incident frequency: the number of incidents per unit of time.
- Incident duration: the total amount of time incidents last.

While these metrics are a good starting point, they don't show the impact of incidents on customers. Let's consider the following graph (blue: the number of requests per five minutes; red: errors; green: reliability on a 0..1 scale). Suppose we have two incidents, one at 1 am and one at 2 pm, each causing about 10% of requests to fail for an equal duration of 30 minutes. Treating these incidents as equally impactful on reliability wouldn't reflect their true effects on customers. By considering traffic volume, the reliability metric can better show that an incident during peak traffic has a bigger impact and deserves higher priority. Our goal is to have a clear signal that an incident during peak traffic is a major problem that should be fixed. This distinction helps prioritize tasks and make sure resources are used effectively; for example, it can prevent the marketing team's efforts to bring in more visitors from being wasted. Additionally, tracking the incident frequency per release can help improve the deployment and testing processes and reduce unexpected issues. In the end, this should lead to faster delivery with lower risks.

Digging Deeper Into Metrics

To get a deeper understanding of these metrics and find areas for improvement, try tracking the following:

- Time to detection: how long it takes to notice an incident.
- Time to notification: how long it takes to notify relevant parties.
- Time to repair: how long it takes to fix an incident.
- Time between incidents: this can reveal patterns or trends in system failures.
- Action item completion rate: the percentage of tasks completed.
- Action item resolution time: the time it takes to implement solutions.
- Percentage of high-severity incidents: this measures the overall reliability of the system.
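The traffic-weighting idea behind these metrics can be made concrete with a small calculation. This is a minimal sketch; the request counts below are made up for illustration:

```python
def weighted_reliability(windows):
    """Overall reliability weighted by traffic volume.

    windows: list of (total_requests, failed_requests) per time window.
    """
    total = sum(t for t, _ in windows)
    failed = sum(f for _, f in windows)
    return 1 - failed / total

# Hypothetical day: a 1 am incident and a 2 pm incident each fail 10%
# of their window's requests, but at very different traffic volumes.
night = (1_000, 100)    # 1 am: low traffic
peak = (50_000, 5_000)  # 2 pm: peak traffic
quiet = (10_000, 0)     # uneventful windows

# Viewed in isolation, both incidents look like identical "90%" windows...
assert weighted_reliability([night]) == weighted_reliability([peak])

# ...but weighted across the whole day, the peak incident dominates:
# it contributes 5,000 of the 5,100 failed requests.
day = [night, quiet, peak, quiet]
print(f"day reliability: {weighted_reliability(day):.4f}")
```

Because the metric counts failed requests rather than failed minutes, the peak-hour incident naturally receives the higher priority the article argues for.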
Finally, regularly reviewing these metrics during weekly operations can help focus on progress, recognize successes, and prioritize. By making this a regular part of your culture, you can use the data from these metrics to drive better decisions and gradually optimize the system. Remember, the usefulness of metrics lies in the actions taken from them and their ability to drive progress. It’s a continuous feedback loop of refining both the data and the action items to keep the system improving. Designing for Resilience A system that isn’t designed to be resilient probably won’t handle peak times as smoothly. Here are some considerations that can help ensure a system’s reliability under a variety of conditions: Do’s: Prepare for component failure: By partitioning the service or using isolation, you can limit the blast radius and reduce the impact of failures. Implement fault-tolerance: Implementing mechanisms like retries, request hedging, and backpressure will improve the system’s availability and performance. Use rate-limiting and traffic quotas: Don’t rely solely on upstream dependencies to protect themselves. Use rate-limiting and traffic quotas to ensure that your system remains reliable. Categorize functionality: Prioritize functions by categorizing them into “critical,” “normal,” and “best-effort” categories. This will help keep essential functions available at all costs during high demand. Implement error-pacing and load-shedding: These mechanisms help prevent or mitigate traffic misuse or abuse. Continuously challenge the system: Continuously challenge the system and consider potential failures to identify areas for improvement. Plan for recovery: Implement fail-over mechanisms and plan for recovery in the event of a failure. This will help reduce downtime and ensure that essential services are available during challenging conditions. 
- Make strategic trade-offs: Prioritize essential services during challenging external conditions.

Don’ts:

- Don’t assume callers will use your service as intended.
- Don’t neglect rare but possible failures; plan and design prevention measures.
- Don’t overlook the possibility of hardware failures.

I explored some of these ideas in the following blog posts:

- Ensuring Predictable Performance in Distributed Systems
- Navigating the Benefits and Risks of Request Hedging for Network Services
- FIFO vs. LIFO: Which Queueing Strategy Is Better for Availability and Latency?
- Isolating Noisy Neighbors in Distributed Systems: The Power of Shuffle-Sharding

Reliability Testing

Reliability testing is essential for maintaining the availability and functionality of a system during high demand. To ensure a reliable system, it is important to:

- Design for testability so each component can be tested individually.
- Have good enough test coverage as a prerequisite for being agile.
- Calibrate testing by importance, focusing on essential functions and giving a bit of slack to secondary or experimental features.
- Perform extensive non-functional testing, such as load testing, stress testing, failure-injection testing, soak testing, and fuzzing/combinatorial testing.

It’s crucial to avoid:

- Blindly pursuing high coverage numbers.
- Assuming that a single data point provides a comprehensive understanding; ensure that results are robustly reproducible.
- Underinvesting in testing environments and tooling.

Proper testing not only ensures correctness, serves as living documentation, and prevents non-functional regressions, but also helps engineers understand the system more deeply, flex their creative muscles while trying to challenge it, and ultimately create more resilient, reliable systems for the benefit of all stakeholders. Remember, if you don’t deliberately stress test your system, your users will do it for you, and you won’t be able to choose when that moment comes.
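As a concrete illustration of one of the fault-tolerance mechanisms named in the Do’s above (retries), here is a minimal sketch of retrying with exponential backoff and full jitter. The function name and its parameters are assumptions for this example, not code from the article.

```python
# Illustrative sketch of one fault-tolerance mechanism mentioned above:
# retries with exponential backoff and full jitter.
import random
import time

def call_with_retries(call, max_attempts=4, base_delay=0.05, max_delay=1.0):
    """Invoke `call`, retrying on failure with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            # Full jitter spreads retries over time so many clients
            # recovering at once don't stampede the upstream service.
            time.sleep(random.uniform(0, min(max_delay, base_delay * 2 ** attempt)))
```

Retries should be paired with rate limits or a retry budget; unbounded retries from many callers can amplify an outage rather than mask it.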
Reliability-Oriented Operations

Operating a distributed system is like conducting an orchestra: a delicate art that requires a high level of skill and attention to detail. Many engineers tend to underestimate the importance of operations or view it as secondary to software development. In reality, however, operations can have a significant impact on the reliability of a system, just as a conductor’s skill and understanding of the orchestra are vital to a harmonious performance. For example, cloud computing providers often offer services built on open-source products: it’s not just about which software you run but how you run it, and that is a big part of a cloud provider’s business. To ensure reliability, there are three key aspects of operations to consider:

- Running the service: This involves hosting configuration, deployment procedures, and regular maintenance tasks like security patches, backups, and more.
- Incident prevention: Monitoring systems in real time to quickly detect and resolve issues, regularly testing the system for performance and reliability, capacity planning, etc.
- Incident response: Having clear incident response protocols that define the roles and responsibilities of team members during an incident, as well as effective review, communication, and follow-up mechanisms to address issues and prevent similar incidents from happening, or at least minimize their impact, in the future.

The incident response aspect is particularly crucial because it serves as a reality check: after all, an incident means that all the measures taken were insufficient. It’s a moment of humility, of realizing that the world is much more complex than we thought. We need to be as honest as possible in identifying all the engineering and procedural weaknesses that enabled the incident and see what we could do better in the future. To make incident retrospectives effective, consider incorporating the following practices:

- Assume the reader doesn’t have prior knowledge of your service.
First of all, you write a retrospective to share knowledge, so write clearly enough that others can understand it.

- Define the impact of the incident. This helps to calibrate the amount of effort to invest in the follow-up measures. Only relatively severe incidents require a deep process; do not normalize retrospectives by holding them for every minor issue that has no potential for lasting impact.
- Don’t stop at comfortable answers. Dig deeper without worrying about personal egos. The goal is to improve processes, not to blame individuals or induce guilt.
- Prioritize action items that would have prevented or greatly reduced the severity of the incident. Aim to have as few action items as possible, each with critical priority.

In terms of not stopping at the “comfortable answers,” it’s important to identify and address underlying root causes for long-term reliability. Here are a few examples of surface-level diagnoses of service disruptions:

- Human error while pushing configuration.
- An unreliable upstream dependency causing unresponsiveness.
- A traffic spike leading to the temporary unavailability of our service.

It can be difficult to come up with action items that improve reliability in the long term based on these diagnoses. Deeper underlying root causes, on the other hand, may sound like:

- Our system allowed the deployment of an invalid configuration to the whole fleet without safety checks.
- Our service didn’t handle upstream unavailability and amplified the outage.
- Our service didn’t protect itself from excessive traffic.

Addressing underlying root causes can be more challenging, but it is essential for achieving long-term reliability. This is just a brief overview of what we should strive for in terms of operations; there is much more to explore and consider, from incident response protocols to capacity planning, with many nuances and best practices to be aware of.
The Human Factor in System Reliability

While procedures and mechanisms play a vital role in ensuring system reliability, it is ultimately humans who bring them to life. So, it’s not just about having the right tools but also about cultivating the right mindset to breathe life into those mechanisms and make them work effectively. Here are some of the key qualities and habits that contribute to maintaining reliability (and not only that):

- Collaboration with other teams and organizations in order to share knowledge and work towards a common goal.
- A degree of humility and an open-minded approach to new information in order to adapt and evolve the system.
- A focus on simplicity and craftsmanship in order to create evolvable and maintainable systems.
- An action-driven and outcome-focused mindset, avoiding stagnation and indecision.
- A curious and experimental approach akin to that of a child, constantly seeking to understand how the system works and finding ways to improve it.

Conclusion

Ensuring reliability in a system is a comprehensive effort that involves figuring out the right metrics, designing with resilience in mind, and implementing reliability testing and operations. With a focus on availability, functionality, and serving the most important needs, organizations can better deliver customer value and adjust risks and priorities accordingly. Building and maintaining a system that can handle even the toughest conditions not only helps drive business success and please customers but also brings a sense of accomplishment to those who work on it. Reliability is a continuous journey that requires attention, skill, and discipline. By following best practices, continuously challenging the system, and fostering a resilient mindset, teams and organizations can create robust and reliable systems that can withstand any challenges that come their way.
TL;DR: ChatGPT 4: A Bargain for Scrum Practitioners?

When OpenAI released its new LLM model GPT-4 last week, I could not resist and signed up for the $20 monthly subscription. I wanted to determine whether ChatGPT 4 is superior to its predecessor, which left a good impression in recent months; see my previous articles on Scrum, Agile, and ChatGPT. I decided to run three comparisons, using the identical prompt to trigger answers from the new GPT-4 and previous GPT-3.5 models. Read on and learn what happened. It was not a foregone conclusion.

Setting up the Session

I ran the session on ChatGPT on Wednesday, March 15, from 5:40 am CET to 6:30 am CET on the models GPT-3.5 and GPT-4 after upgrading to ChatGPT Plus. I did not apply Grammarly to ChatGPT’s answers.

The Initial ChatGPT Prompt to Explore the New Model GPT-4

I used the following initial prompt to prime ChatGPT: “Are you using the same training data as GPT-3.0 or 3.5?”

ChatGPT-4.0: As a GPT-4 model, I am an updated version of the previous GPT models. While I share some similarities with GPT-3 and GPT-3.5, there are differences in the training data used for each model. The primary difference is the time period of the data. GPT-3 and GPT-3.5 were trained on data available up until around 2020 and 2021, respectively. In my case, as GPT-4, my training data extends until September 2021. This allows me to have a more up-to-date understanding of the world, including newer events and developments. Additionally, improvements in data curation, preprocessing, and filtering techniques have likely been made between versions, leading to differences in the quality and breadth of the training data. While the core data sources might be similar, such as books, articles, and websites, the data sets themselves would not be identical across these models.

The Experiments

Now that we have a better understanding of the model’s background, let’s run three experiments to figure out which model is superior.
I will run three comparisons, or A/B tests, using the identical prompt to trigger answers from the new GPT-4 and previous GPT-3.5 models:

- The first experiment compares the Scrum Guide 2017 to the Scrum Guide 2020.
- The second experiment is designing a Retrospective exercise.
- The third experiment is helping a new Scrum team design a Definition of Done.

Learn more about the new GPT-4 model from OpenAI’s GPT-4 Developer Livestream of March 14, 2023, or the OpenAI GPT-4 Technical Report.

Experiment 1: Comparing the Scrum Guide 2017 to the Scrum Guide 2020

Prompt: I want you to act as a Scrum Master. Tell me: what is the difference between the Scrum Guide 2017 and the Scrum Guide 2020?

ChatGPT-3.5: As a Scrum Master, I’d be happy to answer your question about the differences between the Scrum Guide 2017 and the Scrum Guide 2020. The Scrum Guide is the official guide to Scrum, and it provides a framework for implementing and improving Scrum practices. The latest version of the Scrum Guide was released in November 2020, and it contains several updates from the previous version released in 2017. Here are some of the key differences between the two versions: Focus on the Scrum team: The Scrum Guide 2020 places more emphasis on the Scrum team, rather than just the Development Team. The Product Owner and Scrum Master are now considered equal members of the Scrum Team. Empirical process control: The Scrum Guide 2020 highlights the importance of empirical process control and the three pillars of transparency, inspection, and adaptation. Updated language: The language in the Scrum Guide 2020 has been updated to be more inclusive and gender-neutral. Simplified framework: The Scrum Guide 2020 has removed some of the prescriptive language from the previous version to make the framework more flexible and adaptable to different contexts.
Refinement instead of grooming: The term “Backlog Refinement” is now used instead of “Backlog Grooming” to emphasize the continuous improvement aspect of the practice. Overall, the changes in the Scrum Guide 2020 are intended to make the framework more accessible and adaptable to a wider range of contexts while still maintaining the core principles and values of Scrum.

Comment: These answers are irrelevant, or hallucinations: Scrum Master and Product Owner are now “equal members of the Scrum Team”? No mention of commitments, namely the Product Goal? Highlighting the importance of empirical process control? That has been at the core of Scrum from its beginnings. The answer is embarrassing.

ChatGPT-4.0: As a language model, I can provide you with the key differences between the Scrum Guide 2017 and the Scrum Guide 2020. Here are some of the most notable changes: Less prescriptive and more flexible: The Scrum Guide 2020 is shorter and more focused on the Scrum framework’s core concepts. It has reduced the prescriptive nature of some practices, allowing teams more flexibility in adopting Scrum according to their specific context. Introduction of the Product Goal: The 2020 version introduces the concept of a Product Goal, which is a long-term objective for the product. It provides guidance and direction for the team, and all the Sprint Goals should align with and contribute to the Product Goal. Simplified team structure: The Scrum Guide 2020 eliminates the distinction between development team members and other roles. It now refers to the whole team as the “Scrum Team,” consisting of the Scrum Master, Product Owner, and Developers, all of whom share accountability for delivering a potentially releasable increment each sprint. A greater focus on commitment: The Scrum Guide 2020 emphasizes the importance of commitment for each artifact. The Product Backlog has the Product Goal, the Sprint Backlog has the Sprint Goal, and the Increment has the Definition of Done.
These commitments help provide clarity and focus for the Scrum Team. Self-managing instead of self-organizing: The 2020 version refers to Scrum Teams as self-managing, emphasizing their autonomy in making decisions regarding how to achieve the Sprint Goal, and how to create the product Increment. This change highlights the team’s responsibility for organizing and managing their own work, rather than relying on external direction. Removal of the Sprint Retrospective’s three-question format: The 2020 Scrum Guide no longer suggests the specific three-question format (“What went well? What can be improved? What will we commit to improve in the next Sprint?”) for the Sprint Retrospective. Instead, it emphasizes the need to plan improvements and foster a continuous improvement mindset. Updated the Definition of Done: The Scrum Guide 2020 clarifies that the Definition of Done is a formal description of the state of the Increment when it meets the quality measures required for the product. It also emphasizes that the Definition of Done is used to assess when work is complete on the product Increment. These are some of the main differences between the Scrum Guide 2017 and the Scrum Guide 2020. The 2020 version aims to provide a more straightforward, flexible, and focused framework for Scrum practitioners, emphasizing the importance of team collaboration, commitment, and continuous improvement. Comment: We have a clear winner! GPT-4’s answer is better and more substantial than the previous model’s. Score: ChatGPT-4 1-0 ChatGPT-3.5 Experiment 2: Designing a Retrospective Exercise Prompt: I want you to act as an experienced Scrum Master. Your Scrum team had a problematic Sprint and missed accomplishing the Sprint Goal. Some team members had unexpected food poisoning. The procurement department failed to renew credit card information for an essential service your team uses, and that service was no longer available. 
Additionally, the Scrum team ran into unexpected technical debt hidden in the legacy part of the code base. The tension in the team is high; arguments have become more frequent. As a Scrum Master, I want you to use your knowledge of Scrum, iterative and incremental product development, non-violent communication, collaboration, team building, Scrum values, psychology, and Liberating Structures to design a Retrospective that helps the Scrum team to overcome recent setbacks and improve the relationships among each other. ChatGPT-3.5: I’m sorry to hear that your Scrum team had a challenging Sprint. As an experienced Scrum Master, I’m here to help you navigate this difficult situation and design a Retrospective that will help the team overcome recent setbacks and improve relationships among team members. First, it’s essential to acknowledge that setbacks and challenges are a natural part of the Scrum process. Our goal is not to eliminate problems but to learn from them and use them as opportunities for growth and improvement. To start the Retrospective, we need to create a safe and respectful space where team members can express themselves honestly and openly. One Liberating Structure that could help us achieve this is the “1-2-4-All” structure, which involves individuals reflecting silently, sharing with a partner, joining a group of four to discuss their thoughts, and then sharing the group’s ideas with the entire team. We’ll start by asking the team to reflect on the past Sprint and identify what went well and what didn’t go as planned. We’ll then focus on the challenges that arose during the Sprint, including the unexpected technical debt, credit card issues, and team members getting sick. Once we have identified the challenges, we’ll use the Liberating Structure “TRIZ” to explore potential solutions. TRIZ stands for “Theory of Inventive Problem Solving,” and it’s a structured approach to generating creative solutions to complex problems. 
We’ll ask the team to identify the root cause of each problem and then use TRIZ to generate new ideas for addressing those issues. Next, we’ll use the Liberating Structure “Celebrity Interview” to encourage empathy and understanding among team members. Each team member will take on the role of a “celebrity” who embodies certain values or characteristics, and the team will ask them questions about how they would handle the recent setbacks. Throughout the Retrospective, we’ll emphasize the importance of non-violent communication and collaboration, and we’ll remind the team of the Scrum values of commitment, courage, focus, openness, and respect. By the end of the Retrospective, our goal is to have identified concrete steps that the team can take to address the challenges that arose during the previous Sprint and improve their relationships with each other. We’ll create an action plan that outlines these steps and assign owners to each task to ensure accountability and follow-through.

Comment: Sounds good initially, doesn’t it? Too bad the suggestion does not pass the test. Of course, you could use 1-2-4-All to start the Retrospective. However, if every team member is to have their airtime, a structure like “Conversation Café” would be better suited. Instead of TRIZ, I would use “Nine Whys” or “What, So What, Now What?” Finally, the “Celebrity Interview” is useless for the intended purpose. If empathy is your main concern, “Heard, Seen, Respected” would be a better choice.

ChatGPT-4.0: As an experienced Scrum Master, I would design a Retrospective that not only addresses the setbacks but also focuses on rebuilding trust, improving communication, and fostering a collaborative environment within the team. The Retrospective will follow these five steps: Set the Stage: Start the session by acknowledging the challenges the team faced during the Sprint and emphasize the importance of learning from setbacks.
Encourage everyone to participate in a safe, open, and respectful environment. Use a short icebreaker activity to help the team relax and prepare for the session. Gather Data: Begin by asking the team members to write down on sticky notes what went well, what didn’t go well, and any surprising events during the Sprint. Use a timeline activity to help visualize the events and identify patterns. Ask the team to share their experiences openly, being mindful of non-violent communication principles. Generate Insights: Use the ‘5 Whys’ technique or the ‘Fishbone’ diagram to analyze the root causes of the problems faced. Encourage the team to focus on the process and not on individuals. Emphasize Scrum values, such as courage and openness, while discussing the issues. Facilitate a discussion on the impact of the unexpected technical debt and how to address it moving forward. Also, discuss how to improve communication with the procurement department to avoid future service disruptions. Decide What to Do: Guide the team to collaboratively identify action items to address the root causes of the problems. Prioritize the action items by having the team vote on the most important ones. Encourage the team to come up with both short-term and long-term actions. For example, a short-term action could be to create a checklist for procurement processes, while a long-term action could involve implementing automated tests to uncover hidden technical debt. Close the Retrospective: End the session with a team-building activity that reinforces the Scrum values and fosters collaboration. Ask the team members to share one positive thing about their colleagues or to express gratitude for something a teammate did during the Sprint. This will help to improve relationships and reduce tensions among the team members. Conclude the Retrospective by summarizing the key takeaways and the agreed-upon action items, and reiterating the importance of continuous improvement and collaboration. 
Follow up with the team in the upcoming Sprints to ensure that the action items are being implemented and to gauge the effectiveness of the improvements. Encourage the team to maintain open communication and support each other as they work towards achieving their goals. Follow-up Prompt: Is it necessary to run the Retrospective in this strict, prescriptive manner? ChatGPT-4.0: While the five-step structure I provided is a popular and effective way to conduct Retrospectives, it is not mandatory to follow it in a strict, prescriptive manner. As a Scrum Master, you should be flexible and adapt the format based on your team’s needs, preferences, and specific situation. It is essential to maintain the core principles of a Retrospective, which include creating a safe and open environment for discussion, reflecting on the past Sprint, generating insights, identifying improvements, and fostering a sense of collaboration and continuous improvement within the team. You can experiment with different techniques and activities to achieve these objectives, and it is common for experienced Scrum Masters to tailor Retrospectives to suit their team’s dynamics. Some alternative approaches include: Lean Coffee: A structured, agenda-less meeting format that allows participants to democratically decide on the topics they want to discuss. World Café: A technique that uses rotating small-group discussions to gather input from all team members on multiple topics. Starfish: A visual method for capturing what the team should start doing, stop doing, continue doing, do more of, and do less of. The key is to be receptive to your team’s feedback and continuously improve the Retrospective process to ensure it remains valuable and engaging for everyone involved. Comment: The first suggestion was merely a standard application of the Esther Derby and Diana Larsen model, ignoring the additional information on the situation from the prompt. 
If we also consider the answer to the follow-up prompt, the GPT-4-based performance gets a bit better, although it is mediocre at best. However, such a performance is sometimes sufficient when the competitor’s answer is even worse. Score: ChatGPT-4 2-0 ChatGPT-3.5 Experiment 3: Help a New Scrum Team to Design a Definition of Done Prompt: I want you to act as an experienced Scrum Master. The Definition of Done is essential for a Scrum team’s success. It defines the quality standard any Increment is expected to meet by stakeholders and governance bodies. The quality level does not have to be perfect or top-of-the-line in all situations. However, it is critical that once an Increment is “done,” it can be delivered to the customers without any legal, ethical, or financial repercussions. I want you to use your knowledge of Scrum, iterative and incremental product development, collaboration, product management, stakeholder collaboration, quality management, innovation, and business agility in general to suggest a process on how a new Scrum team can create an initial Definition of Done and then maintain it accordingly. ChatGPT-3.5: Sure, I’d be happy to help you with that. Creating an Initial Definition of Done: Collaborate with the team: As a Scrum Master, you need to work with the team to create a Definition of Done that everyone agrees upon. You can facilitate a discussion during a Sprint Retrospective or a dedicated Definition of Done workshop to ensure everyone’s voices are heard. Understand stakeholders’ expectations: It’s essential to understand what the stakeholders expect from the product and what quality standards they have in mind. The team should incorporate these expectations into the Definition of Done, which will ensure that the product meets stakeholder expectations. Identify quality metrics: Define the metrics that can be used to measure quality. Metrics like code coverage, performance, and security can help establish the Definition of Done. 
The team should strive to meet these metrics when delivering a potentially releasable Increment. Consider technical debt: Teams should consider the technical debt they’re willing to take on when developing a feature. Technical debt is the cost of rework caused by taking shortcuts or delaying work. The team should define the level of technical debt they’re willing to accept and communicate this to stakeholders. Keep it simple: The Definition of Done should be concise, clear, and easy to understand. It should be written in a language that’s accessible to all team members and stakeholders. Maintaining the Definition of Done: Review and update the Definition of Done regularly: The Definition of Done should be reviewed and updated regularly. This can be done during Sprint Retrospectives or dedicated Definition of Done workshops. The team should consider feedback from stakeholders and make necessary changes to the Definition of Done. Communicate changes to stakeholders: Whenever the Definition of Done changes, the team should communicate this to stakeholders. This will help ensure that everyone is on the same page and that the product meets expectations. Use the Definition of Done during Sprint Reviews: During Sprint Reviews, the team should use the Definition of Done to determine whether the Increment is ready for release. If it doesn’t meet the Definition of Done, the team should identify the gaps and work to close them in the next Sprint. Track and report progress: Teams should track and report progress against the Definition of Done metrics. This can help identify areas where improvement is needed and drive continuous improvement. Use feedback to improve: The team should use feedback from stakeholders to improve the Definition of Done. This can be done by incorporating feedback into the Definition of Done or adjusting the metrics used to measure quality. Overall, creating and maintaining a Definition of Done is critical to the success of a Scrum team. 
By following these steps, the team can ensure that they’re delivering quality Increments that meet stakeholder expectations and can be released without any legal, ethical, or financial repercussions. Comment: Wow, who would have thought ChatGPT-3.5 could deliver at such a high level? The approach is collaborative and inclusive, technical debt is part of the consideration, and the Definition of Done shall be simple and regularly inspected and adapted, where feedback from stakeholders is as valuable as data from tracking suitable metrics. ChatGPT-4.0: As an experienced Scrum Master, I recommend the following process for a new Scrum team to create an initial Definition of Done (DoD) and maintain it: Gather Inputs: Start by collecting inputs from the Scrum team, stakeholders, governance bodies, and any existing organizational standards or guidelines. These inputs will help establish the baseline quality expectations and constraints for the product. Educate the Team: Ensure that all team members understand the purpose and importance of the DoD. Explain how it helps create transparency, manage expectations, and ensure a potentially releasable product Increment. Collaborative Workshop: Organize a workshop with the Scrum team, Product Owner, and relevant stakeholders to collaboratively create the initial DoD. Facilitate discussions on the quality standards and acceptance criteria, including functional and non-functional requirements, performance, security, usability, and compliance. Draft the Definition of Done: Based on the workshop outcomes, create a draft of the DoD that includes a clear and concise list of criteria that must be met for an Increment to be considered “done.” Ensure that the criteria are measurable, achievable, and aligned with the overall product vision. Review and Refine: Present the draft DoD to the Scrum team, Product Owner, and stakeholders for review and feedback. 
Make any necessary adjustments based on their input, and iterate until a consensus is reached. Communicate and Display: Once the initial DoD is finalized, communicate it to all relevant parties and display it prominently in the team’s workspace or on a shared online platform. This ensures that everyone is aware of the expectations and can refer to the DoD when needed. Regularly Review and Update: Incorporate a review of the DoD into the team’s Retrospectives, or schedule periodic reviews to ensure it remains relevant and aligned with the product’s evolution. Update the DoD as needed, based on lessons learned, changing requirements, stakeholder feedback, and any changes in organizational standards or regulations. Monitor Compliance: As a Scrum Master, monitor the team’s adherence to the DoD throughout each Sprint. Provide support and guidance to help the team meet the established criteria and address any challenges that arise. By following this process, a new Scrum team can create an initial Definition of Done and maintain it to ensure consistent quality and alignment with stakeholder expectations throughout the product development lifecycle. Comment: Gee, this is unexpected. It sounds like a PMI brochure. Follow-up Prompt: Your suggestion contains too much project management thinking and ignores the importance of collaboration and teamwork in a Scrum team. Partly, it is plain wrong: A Scrum Master does not create the initial Definition of Done. Also, they do not enforce compliance; that is the task of the Developers. Recreate the answer based on Scrum’s first principles of self-management and servant leadership. ChatGPT-4.0: I apologize for any confusion in my previous response. As a Scrum Master, your role is to facilitate and guide the team through the process of creating and maintaining the Definition of Done (DoD), while emphasizing the importance of collaboration, teamwork, and self-management. 
Here’s an updated approach based on Scrum’s first principles: Set the Context: Ensure that the Scrum team, including the Product Owner, understands the purpose and importance of the DoD. Explain how it helps establish a shared understanding of quality standards, manage expectations, and ensure a potentially releasable product Increment. Facilitate Collaboration: Encourage the Scrum team to collaborate with stakeholders, governance bodies, and any existing organizational standards or guidelines to gather inputs that will help establish the baseline quality expectations and constraints for the product. Empower the Team: Support the Developers in organizing a workshop or a series of discussions where they can collaboratively create the initial DoD. Encourage them to consider functional and non-functional requirements, performance, security, usability, and compliance. Refinement and Consensus: As the Developers draft the DoD, facilitate discussions and feedback loops between the team, Product Owner, and stakeholders. Help them iterate on the DoD until they reach a consensus on the criteria that must be met for an Increment to be considered “done.” Encourage Transparency: Once the initial DoD is finalized, encourage the Developers to communicate it to all relevant parties and display it prominently in the team’s workspace or on a shared online platform. This will ensure that everyone is aware of the expectations and can refer to the DoD when needed. Foster Continuous Improvement: During Retrospectives, remind the team to review the DoD and discuss any potential improvements or adjustments based on lessons learned, changing requirements, stakeholder feedback, or changes in organizational standards or regulations. Support Compliance: As a servant leader, provide support and guidance to help the Developers adhere to the DoD throughout each Sprint. Encourage open communication and address any challenges that arise. 
Remember that the responsibility of meeting the DoD criteria lies with the Developers, not with the Scrum Master. By following this approach, you can enable the Scrum team to create and maintain a Definition of Done that is rooted in collaboration, self-management, and continuous improvement, while staying true to the principles of servant leadership. Comment: ChatGPT-4.0 talks a lot about collaboration, self-management, and continuous improvement. However, it seems that the Scrum Master is still in charge of pushing the Developers in the right direction from the Scrum Master’s perspective. Honestly, I like ChatGPT-3.5’s answer better. PS: The Product Owner is also a member of the Scrum team. Score: ChatGPT-4 2-1 ChatGPT-3.5 Conclusion ChatGPT based on the GPT-4 model has advantages over the “old” GPT-3.5 model. OpenAI promised better reasoning and conciseness and delivered. However, judging by three small everyday experiments, the new model’s advantage is not as conclusive as expected. So, let’s wait for GPT-5. Nevertheless, I will continue my subscription. What is your opinion as a Scrum Practitioner: Is ChatGPT-4.0 worth spending $20 per month compared to ChatGPT-3.5? Please share it with us in the comments.
Effective project management is essential in software development. Jira’s Epics provide a powerful solution for managing complex software projects, and understanding how to use them can make all the difference. In this article, we’ll dive deep into the technical aspects of utilizing Epics in the Jira workflow, exploring the advanced features and Jira best practices that can streamline your software development process. So, let’s delve into the intricacies of agile project management in software development and take your workflow to the next level. Example of an Epic In essence, an Epic is a collection of user stories that groups them together into one valuable item. This helps the team, Project Managers, and Stakeholders see the bigger picture and the overall strategy of the project without needing to dive into the details. Let’s say you run a general goods store. There are dozens of daily activities you need to keep tabs on: keeping the inventory in check, making sure new products are labeled properly on the shelves, cleaning the aisles, etc. These are your “stories.” Putting all of them into one gargantuan to-do list doesn’t make a lot of sense, as you’ll end up with a lot of checklist items of varying value and complexity. This will make prioritization, and even simple analysis of the workload, way more challenging than it needs to be. However, if you group them together using a common denominator, you’ll have a much more manageable list. In this case, related tasks like an inventory check, ordering a new delivery, and paying the vendor will become an Epic for your procurement process. You can then use the newly created Epics to prioritize your work based on the following factors: The impact an Epic has on your business. The effort, or the amount of work you’ll need to complete the tasks. The risks associated with an Epic, such as the risk of not having full information, of being a bad market fit, or of not meeting regulations.
Any dependencies with other tasks or Epics. An example that’s closer to software development can involve the addition of Augmented Reality functionality to a mobile app of a beauty product vendor. You’ll need the involvement of multiple people working on various tasks to accomplish your goal. Therefore it would make sense to group them together into an Epic. What’s the Right Size for an Epic? The appropriate size for an Epic is somewhat of a hot topic in the development community. You won’t find the “correct” answer in the Scrum Guide or in any other reference material. I’ve heard arguments that one Epic should be doable in one Sprint, and I’ve also heard that Epics can take as long as several years. In both cases, the justification was something along the lines of, “Hey, it works for our team; therefore, it is a good practice.” In my experience, several years is a very long time to complete an Epic. This timeframe is closer to themes and initiatives in terms of volume and complexity. A single Sprint time frame doesn’t make too much sense to me, either. This is enough time to make a feature, and a feature is indeed composed of multiple stories, but I don’t believe Epics should be feature-sized. I’m more inclined to agree with Mr. Chris Belknap and his estimation of how big an Epic should be: A theme is a multi-year endeavor. Epics take from six months to a year. Features take around three months. And stories take one Sprint or less. That said, at the end of the day, the “correct” size of an Epic is whatever works for you and your organization. The key is making sure that everyone, including the stakeholders, managers, and team members, understands what an Epic is in your case. Difference Between Epics, User Stories, Themes, and Initiatives By default, Jira has three elements to its hierarchy: Epics, Issues (Stories), and Subtasks. You can add additional layers to this functionality with Themes and Initiatives. 
The latter is only available in Jira Premium or through third-party add-ons. Initiatives Initiatives are like Epics for other Epics, providing a strategic bird’s eye view for overseeing multiple company projects. For instance, an initiative for a certain project, such as an e-commerce store, could aim to boost its market share in Spanish-speaking countries. This initiative would then be broken down into smaller projects like establishing connections with new vendors, investing more in advertising and marketing, or developing a mobile app for the store. Themes Themes are labels used to group work by categories in Jira. For example, a theme might involve adding a Spanish localization to reach a new market. Epics Epics serve as a project management tool to break down complex work, prioritize tasks, plan increments, and track progress among one or more teams working on the project. Epics are commonly used by developers to define project features. For instance, when creating an e-commerce website, account management, shopping cart functionality, and PayPal integration could be considered Epics. Each task required to launch the website becomes an issue or story associated with these Epics. It’s important to note that an Epic does not necessarily represent the entire project, as large projects are often divided into multiple Epics. For instance, filling website pages with content or improving SEO could be separate Epics in the e-commerce website example. Issues Issues refer to different types of tasks, like Bugs or Stories, which form smaller features within an Epic. In this regard, a feature is the smallest element of your product that the customer can use or interact with to gain value. For instance, a customer logging into your product is a feature that can have multiple stories: log in with email, log in with social media, etc., whereas login, password reset, and account deletion would together form an “Account Management” Epic.
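To make the hierarchy discussed above concrete, here is a minimal sketch of how Initiatives, Epics, Stories, and Subtasks nest. The names and structure are illustrative assumptions for the sketch, not a real Jira data model:

```python
# Illustrative sketch of the Initiative -> Epic -> Story -> Subtask hierarchy.
# Names are examples from the article, not a real Jira schema.

initiative = {
    "name": "Expand into Spanish-speaking markets",
    "epics": [
        {
            "name": "Account Management",
            "stories": [
                {"name": "Log in with email", "subtasks": ["Confirmation screen", "Verification email"]},
                {"name": "Log in with social media", "subtasks": []},
                {"name": "Password reset", "subtasks": []},
                {"name": "Delete account", "subtasks": ["GDPR background clean-up", "Test coverage"]},
            ],
        },
    ],
}

def count_stories(initiative: dict) -> int:
    """Count all stories across every Epic in an initiative."""
    return sum(len(epic["stories"]) for epic in initiative["epics"])

print(count_stories(initiative))  # 4
```

A structure like this is also a quick sanity check during planning: if one Epic accumulates dozens of stories, it is probably theme-sized and worth splitting.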
Pro tip: You can use a Jira Checklist to refine your requirements through a more detailed decomposition of tasks. Checklists can also be helpful when you have research items, as you will have a list specifying the things you want to check. Subtasks Subtasks are a way to segment work and further break down issues in Jira. How to use Jira subtasks? If there’s a story for deleting an account, subtasks could include creating a confirmation screen, sending a verification email, performing background clean-up according to GDPR rules, and ensuring adequate test coverage. How Are Epics Used in Agile? The Agile approach to project management encourages teams to break down their work into manageable segments or Sprints. This approach helps with prioritization and makes sure developers are lean enough to incorporate feedback and respond to change. Epics are an element of the hierarchy that’s designed to organize the smallest pieces of work and track progress at a somewhat larger scale. Their role in Agile can be broken down into three parts: Break down the project to make the work more manageable. Align the stakeholders with the product and your vision. Highlight dependencies. These goals are achieved when Epics are visualized on a roadmap. This roadmap is similar to a Gantt chart but simpler. It offers a high-level overview of the scope by showing you the Epics (1), associated stories (2), and the time frame. You can also clearly see dependencies (3) on the roadmap. Check out this guide that’s dedicated to the Jira roadmap to learn more. Jira extends your abilities for management and planning with a wide selection of accessible reports. The three that will help you with managing work and Epics are: 1. Burnup Report This report shows the actual number of story points that were burned during a Sprint versus the scope of work, i.e., the total number of story points that must be completed. 2.
Velocity Report The velocity report compares your commitment, or the amount of work that was present when the Sprint began, versus the work that has been completed. 3. Cumulative Flow Diagram The cumulative flow diagram shows you all of the tickets based on their workflow status. You can use this to learn more about potential bottlenecks in your processes; for example, if too many tickets are piling up in the ready-for-QA status, perhaps you need more quality assurance engineers. How To Create a New Epic Creating a new Epic in Jira is actually quite easy. You need to open the Roadmaps interface in your Jira and click the Create Epic button. You can then add new stories to your epic by clicking on the “+” button and using the drag-and-drop interface to adjust the time frame. How To Structure an Epic The tricky part is that most of the project planning work needs to be done before you even think about opening Jira. Our team uses the BRIDGeS framework for multi-context analysis. This planning stage can take some time, and it will require involvement from the whole team, but the entire process can be broken down into four major elements: 1. Problem Description Conduct a brainstorming session to identify key subjects and their descriptors. Ensure that all team members have a thorough understanding of the context before moving forward. 2. Prioritization Exercise Prioritize benefits, risks, and issues to focus on the most important tasks during the Solution Stage. This will prevent wasting time on less critical tasks. 3. Solution Variations Generate potential high-level solutions that address the benefits, risks, and issues identified in the Problem Space. 4. Solution Breakdown Break down the high-level solution into smaller, manageable bodies of work. These will become your list of Epics that are already designed within a roadmap-like implementation plan. Translate this plan into Jira.
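Translating the plan into Jira can also be scripted rather than done by hand. The sketch below builds a create-issue payload in the shape expected by Jira's REST API v2; note that the project key, the example summary, and especially the "Epic Name" custom field ID (`customfield_10011`) are assumptions that vary per Jira instance, so verify yours before relying on this:

```python
import json

def build_epic_payload(project_key: str, summary: str, epic_name: str) -> dict:
    """Build a Jira REST API v2 create-issue payload for an Epic.

    In company-managed projects, "Epic Name" is a required custom field
    whose ID (customfield_10011 here, an assumption) differs per instance.
    """
    return {
        "fields": {
            "project": {"key": project_key},
            "summary": summary,
            "issuetype": {"name": "Epic"},
            "customfield_10011": epic_name,  # assumed Epic Name field ID
        }
    }

payload = build_epic_payload("SHOP", "Augmented Reality try-on", "AR Try-On")
print(json.dumps(payload, indent=2))

# To actually create the Epic, you would POST this payload, e.g. with the
# requests library (hypothetical instance URL and credentials):
#
#   import requests
#   requests.post(
#       "https://your-company.atlassian.net/rest/api/2/issue",
#       json=payload,
#       auth=("user@example.com", "api-token"),
#   )
```

Scripting this pays off when a planning session produces a whole roadmap of Epics at once: one loop over the plan replaces a long session of clicking through the UI.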
You can learn more about the framework in more detail from this guide to project management in Jira. Tips and Best Practices for Jira Epics Lastly, here are some handy tips for working with Epics and Roadmaps in Jira. Build Epics around OKRs or Objectives and Key Results. This way, the team will know what needs to be done and why. Define the completion of an Epic as well as the Definition of Done and Acceptance Criteria beforehand. You can also use checklist templates to help with automating this process. Don’t make your Epics too big. There is always the temptation of adding new tasks to an Epic simply because it’s already there. Close your Epics when they are actually done rather than when all of the tasks are closed. It is ok to leave open issues in an otherwise completed Epic if they’ve lost their relevance or have been deprioritized. Revise the roadmap on a regular basis. My team has a practice of revisiting the roadmap on a weekly basis. This helps us communicate and visualize blockers in a timely and efficient manner. Be realistic with your scope. Having more tasks than your team can realistically deliver will cause fast burnout, and it will make the working process much harder than it needs to be simply due to the need to juggle tasks. Clearly outline your dependencies. This will make prioritization much simpler, and the bigger picture will be more visible to the stakeholders. Take a clear look at issues without Epics during your backlog grooming sessions.
Data engineering is a rapidly evolving field that is constantly challenged by the increasing volume, velocity, and variety of data being generated and processed by organizations. Traditional data engineering approaches are often centralized and monolithic, which can lead to challenges in scalability, agility, and flexibility. In recent years, a new architectural paradigm called Data Mesh has emerged as a novel way to address these challenges and enable more efficient and effective data engineering. Data Mesh is a distributed and domain-oriented data architecture that advocates for a paradigm shift in how data engineering is approached within organizations. It was first introduced by Zhamak Dehghani, a thought leader in the data engineering community, and has gained significant attention as a promising approach to modern data engineering. At the core of Data Mesh is the concept of domain-oriented ownership, where data engineering responsibilities are distributed across cross-functional teams based on domain expertise rather than being centralized in a single team. This means that each team takes ownership of the data for a specific domain, such as customer data, product data, or financial data, and is responsible for the end-to-end data lifecycle, including data ingestion, processing, storage, and consumption. One of the key principles of Data Mesh is the concept of self-serve data infrastructure, which empowers domain teams to independently manage their data without having to rely heavily on central data engineering teams. This is achieved through the use of platform thinking, where domain teams are provided with a set of shared data infrastructure components, tools, and services that they can use to build their own data pipelines, data lakes, and data applications. Another important aspect of Data Mesh is the use of product thinking in data engineering. 
This means treating data pipelines and data products as first-class citizens with similar rigor and practices as software products. Domain teams are encouraged to think in terms of data products that are designed to serve specific data consumers, such as data scientists, analysts, and business users. This approach promotes a product mindset, where data engineering is seen as a product development process that involves continuous iteration, feedback loops, and customer-centric thinking. Data Mesh also emphasizes the use of domain-driven design (DDD) principles, which aligns well with the domain-oriented ownership concept. DDD is a software design approach that focuses on understanding and modeling the domains of a system, and Data Mesh extends this concept to data engineering. Domain teams are encouraged to define clear boundaries and interfaces for their data domains and to use domain-specific language and concepts when designing their data pipelines and data products. This helps to ensure that data is treated as a first-class citizen within each domain and that data is modeled and processed in a way that aligns with the specific needs of the domain. One of the benefits of Data Mesh is improved scalability and agility. By distributing data engineering responsibilities across domain teams, organizations can leverage the expertise and knowledge of these teams to develop and manage data pipelines more efficiently. As a result, domain teams are closer to the data and the business context. This allows them to make faster decisions, iterate on data products more rapidly, and respond to changing business requirements with greater agility. Data Mesh also promotes a culture of data ownership and data collaboration. By giving domain teams ownership of their data, Data Mesh encourages a sense of accountability and responsibility toward data quality, data privacy, and data governance. 
Domain teams are also encouraged to collaborate with other teams, both within and outside their domain, to ensure that data is integrated, validated, and transformed in a consistent and coherent manner across the organization. This culture of data ownership and collaboration helps to foster a data-driven culture within the organization and promotes better data practices. Another benefit of Data Mesh is improved data democratization. By providing domain teams with self-serve data infrastructure, organizations can empower a broader set of users, including data scientists, analysts, and business users, to access and analyze data more easily. This democratization of data allows for faster and more informed decision-making across the organization. Domain teams can also tailor their data products to the specific needs of their data consumers, leading to more relevant and actionable insights. In addition, Data Mesh enables organizations to leverage the best tools and technologies for each domain. Since domain teams have autonomy in choosing their data infrastructure components, they can select the best-fit tools and technologies that align with their domain's requirements. This promotes innovation and flexibility in data engineering, allowing for the adoption of cutting-edge technologies and practices that can drive better data outcomes. Data Mesh also promotes a DevOps mindset in data engineering. Domain teams are responsible for the entire data lifecycle, from ingestion to consumption, which includes monitoring, testing, and deployment of data pipelines and data products. This encourages a DevOps culture where data engineers work closely with data operations (DataOps) teams to ensure that data products are developed, tested, and deployed in a reliable and automated manner. However, implementing Data Mesh also comes with challenges. One of the main challenges is the need for cultural and organizational change. 
Shifting from a centralized data engineering approach to a domain-oriented ownership model requires changing mindset, culture, and organizational structure. It may also require changes in roles and responsibilities and redefining processes and workflows. Therefore, organizations need to invest in training, education, and change management efforts to ensure the smooth adoption of Data Mesh. Another challenge is the complexity of managing distributed data pipelines and data products. With domain teams having autonomy in designing and managing their data infrastructure, there may be a need for standardization, documentation, and governance to ensure consistency, reliability, and security of data. Organizations need to establish clear guidelines, standards, and best practices to ensure that domain teams adhere to common data engineering principles while still having the flexibility to innovate. Implementing Data Mesh architecture requires careful planning, coordination, and a step-by-step approach. Here are some key steps to consider when implementing Data Mesh: Define Domain-Oriented Ownership Identify and define the different domains within your organization that are responsible for specific data products or areas of expertise. This could be based on business functions, departments, or specific data domains. Assign domain ownership to respective teams and clearly define their responsibilities, authority, and accountability for data products within their domain. Foster a Product Thinking Mindset Encourage domain teams to adopt a product thinking mindset where they treat their data products as products that are designed, developed, and managed with a focus on customer needs and outcomes. Encourage them to follow product development practices such as defining product roadmaps, setting product goals, conducting user research, and incorporating feedback loops to continuously iterate and improve their data products. 
Enable Self-Serve Data Infrastructure Provide domain teams with the autonomy to choose their data infrastructure components, tools, and technologies that best suit their domain's requirements. This may include data ingestion, storage, processing, and visualization technologies. Establish guidelines and standards to ensure consistency and interoperability while allowing domain teams the flexibility to innovate and experiment with new technologies. Promote Domain-Driven Design Encourage domain teams to adopt domain-driven design principles, where they model their data products based on the specific needs of their domain. This includes defining domain-specific data models, APIs, and data contracts that are tailored to the requirements of their domain's data consumers. This promotes the reusability, scalability, and extensibility of data products. Establish Data Governance Define clear guidelines and standards for data governance, including data quality, security, privacy, and compliance. Ensure that domain teams adhere to these standards and implement necessary data governance practices in their data products. This may include data profiling, data lineage, data cataloging, and data access controls. Foster Collaboration and Communication Encourage cross-functional collaboration and communication between domain teams, data operations (DataOps) teams, data scientists, and data consumers. Foster a collaborative culture where teams share knowledge, best practices, and lessons learned. This can be facilitated through regular meetings, workshops, knowledge-sharing sessions, and collaboration tools. Invest in Training and Education Provide training and education to domain teams and other stakeholders to ensure a common understanding of Data Mesh principles, practices, and tools. This may include technical training on data engineering technologies, product management, domain-driven design, and agile practices. 
It is essential to invest in the development of skills and capabilities needed for the successful implementation of Data Mesh. Continuously Monitor and Improve Implement monitoring and observability practices to track the performance, reliability, and scalability of data products. Collect feedback from data consumers and iterate on data products to continuously improve their quality and relevance. Monitor and measure key performance indicators (KPIs) to assess the impact and value of Data Mesh implementation. Implementing Data Mesh is not a one-time task but an ongoing process that requires continuous improvement, learning, and adaptation. In addition, it requires a collaborative effort from different teams within the organization and a commitment to embrace a culture of autonomy, ownership, and innovation. By following these steps and continuously improving the implementation, organizations can successfully adopt Data Mesh architecture and unlock the full potential of their data assets. Conclusion Data Mesh architecture is a paradigm shift in data engineering that promotes domain-oriented ownership, self-serve data infrastructure, product thinking, and domain-driven design. It provides organizations with improved scalability, agility, data democratization, and innovation. However, implementing Data Mesh requires cultural and organizational changes and addressing challenges related to managing distributed data pipelines and products. Organizations that successfully embrace Data Mesh can unlock the full potential of their data assets and drive better data outcomes.
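To make the data-contract idea from the domain-driven design step above concrete, here is a minimal sketch of a contract for a customer domain's data product. The field names and validation rules are illustrative assumptions for the sketch, not part of any Data Mesh standard:

```python
from dataclasses import dataclass
from datetime import datetime

# Illustrative data contract for a "customer" domain's data product.
# Field names and rules are assumptions, not a standard.

@dataclass(frozen=True)
class CustomerRecord:
    customer_id: str
    email: str
    created_at: datetime

def validate(record: CustomerRecord) -> list[str]:
    """Return a list of contract violations (an empty list means valid)."""
    errors = []
    if not record.customer_id:
        errors.append("customer_id must be non-empty")
    if "@" not in record.email:
        errors.append("email must contain '@'")
    return errors

good = CustomerRecord("c-123", "ada@example.com", datetime(2023, 1, 1))
bad = CustomerRecord("", "not-an-email", datetime(2023, 1, 1))
print(validate(good))  # []
print(validate(bad))   # two violations
```

Publishing a small, explicit contract like this at the domain boundary is what lets consuming teams depend on a data product without coordinating every schema change by hand.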
There is a project. Code is written, everything is tested, coverage is high, and features get delivered. There are bugs from time to time, but incidents get fixed fast, and no one worries about the situation. Well, business as usual. Time passes... From time to time, there are situations where people get nervous due to critical issues after the release. All hands on board, and after several stressful hours, things get fixed. Time passes... The number of incidents gets bigger. Teams spend more time fixing them. And people start to wonder how to get back to the comfortable times from the past... Let’s Make the World a Better Place! There is a decision to run a post-mortem to find a way to make everyone's life easier. Everyone wants to solve the problem: every developer, tester, manager, analyst, etc. What more could you need? The meeting is fruitful, and there are a lot of ideas: add more tests; analyze requirements more carefully; invest more time in knowledge sharing. Yet another productive meeting with a great outcome! What Just Happened? It is worth noticing that the decision the team made added more work. They need to write more tests, spend more time on analysis, and run additional knowledge-sharing sessions. Like it or not, they just decided to increase their lead time. Of course, it does not have to mean they made a wrong decision. Sometimes it is worth doing so for a short period as an investment. The question is: are we talking about this scenario now? Did They Make a Good Decision? Things are tricky because, on the surface, everything makes sense: more tests, better analysis, knowledge-sharing sessions — it does not sound like anything harmful. Yet, what is the missing part of this equation? When things get worse, the first step should not be to look for improvements but for the root cause of the problem. You should first look at the things that changed.
In our imaginary scenario: the code coverage is still the same as it was when everything was fine; developers were and are writing tests; analysts are sharing requirements using the same practices they have been using for years. If this is true, why do we think we need to improve the good practices we have been using for years? If the same exemplary techniques with the same quality are used, why do we think there's a problem in those areas? Is the “Root Cause” You Found the Real Root Cause? When you face a problem that was not present in the past, you should first look for the things that changed. I'm not saying this is always the cause of the problem, but this is where you should start. What did NOT change? The code is written in the same manner, the coverage and metrics are on the same level, the quality of devs, testers, and analysts is still the same, and the same techniques and practices are used to deliver new functionality. What changed? The complexity of the project is different. Each feature, each new definition, and each supported scenario is new knowledge. This is great, but it gets bigger over time. And this is where you should look for improvements. Instead of adding more tests, spending more time on analysis, spending more time in meetings, and adding more checks and validations, you should think about how to tackle the growing complexity of your project. Is that easy? Of course not. But keeping that and fast delivery in mind, you can finally look for the improvements your project needs.
Nowadays we live in a world ruled by trending topics, and sometimes we forget basics like being more effective and productive. More and more, I see companies' technology leaders looking for new technological products or new architectural patterns to solve the company's needs, forgetting that strong efficiency principles can do just as much for their organization. Have you ever wondered how many hours your team works on irrelevant or unnecessary tasks? Squads and engineers are usually deeply involved in their work and make a great effort to achieve their goals, but many times a large part of this effort is dedicated to tasks without any business impact. C-levels, engineering managers, and tech leaders should be focused on increasing engineers' productivity. Effectiveness and productivity can be increased with the help of ChatGPT, Data Mesh, or Cloud, but you can also increase them by promoting a strong company culture oriented toward increasing focus time for your engineers. In the last few months, I have also seen a wave of companies requesting their employees to come back to the office, having concluded that remote work is not efficient. In my opinion, the problem is that we have brought the inefficiencies of the office to remote work, and sometimes we have even increased them. Main Noise Generators There are several noise generators in our day-to-day. The following are the most important: Message systems such as Slack, Discord, or Teams. Meetings: people love them. Emergent tasks: everything is important and urgent. Message Systems A few years before the pandemic, message systems such as Slack, Discord, or Microsoft Teams were not the main communication channel for companies that did not work remotely. The main communication channels were email, phone calls for urgent issues, or going to a workmate's desk: Managers spent a lot of time reviewing emails and replying to them, but always in an asynchronous way.
At the office, managers or principal engineers always had people sitting next to them, asking questions three or four times a day. In those days, and for the engineers of those companies, working at home meant an amazing amount of focus time. The world has changed a lot in the last few years. Many companies have adopted remote work, but not many of them applied any good practices; they projected their way of working in the office onto the remote model. In addition, nowadays we are continuously unfocused because of mobile and social media applications, and unfortunately, this is being transferred to society as a whole and to the way we work. New message systems are great communication tools that make communication easier and more effective. If we used these tools well, we could increase productivity. But often these tools generate a lot of noise and scatter people's attention. These are some of the symptoms of improper usage: There are too many messages all day in general channels. People usually reply to messages instantly. Several engineers from the same squad reply to messages during the same hours. Requests, incidents, and emergent tasks are raised and resolved in global channels. We do not have to stop using these tools, but we do need to use them better and try to guarantee focus time for our teams. Message systems should be an asynchronous chat for most of the team; of course, there are special cases, like support teams or on-call engineers, who should be focused on the chat. Tips to Improve Some recommendations to improve efficiency: Reduce and optimize the number of public channels, since the more channels there are, the more noise is generated. Set times for posting messages on general channels regarding global communications, for example, early in the morning. On the public squad channel, always establish an on-call person. Treat it as a fully asynchronous system, except for the channels used for on-call and incidents.
Establish a policy for muting non-priority channels. Meetings There are three main characteristics of meetings: People love to create meetings because it is a way to socialize. Meetings kill productivity. There is always time for another meeting. This problem has increased a lot with remote work. At the office, these meetings were usually held in meeting rooms, so there were physical restrictions. If there were no available rooms, there were no meetings. Remote work and the new communication tools have eliminated these physical restrictions, so now there are more meetings than ever. These are some of the symptoms of improper usage: Often several people are not paying attention or participating in the meeting. Many brainstorming meetings. Managers with a full calendar of meetings. Many mid-morning meetings. People do not come prepared for meetings. No documents, or very long documents, without clear goals, business value, or impact analysis. Many meetings with more than six people. In addition to sharing the agenda and goals, it is a good practice to write a summary of the contents and share it before the meeting. This helps people establish the messages they want to convey and the rest of the participants to prepare for the meeting. Often this preparation makes the meeting unnecessary because the topic can be resolved asynchronously. Here is an exercise to check the productivity of the meetings we have in the coming week: On Monday morning, review all the meetings scheduled for the next 7 days. Create a table with the following information: Meeting name. Meeting description. Date and duration. Number of people. Whether the agenda and goals are indicated. My goals for this meeting. Whether the meeting goals were achieved. Whether my goals were achieved. Whether a follow-up meeting has been scheduled. Good and bad decisions taken in the meeting. What we need to improve at the next meeting. After a month, analyze the results and share them with the team.
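The weekly meeting-audit exercise above is easy to script. Here is a minimal sketch (column names follow the list in the exercise; the file name and example meetings are assumptions) that writes an empty review table as a CSV you can fill in during the week:

```python
import csv

# Columns follow the meeting-audit exercise described above.
COLUMNS = [
    "Meeting name",
    "Meeting description",
    "Date and duration",
    "Number of people",
    "Agenda and goals indicated?",
    "My goals for this meeting",
    "Meeting goals achieved?",
    "My goals achieved?",
    "Follow-up meeting scheduled?",
    "Good and bad decisions taken",
    "What to improve next time",
]

def write_audit_template(path: str, meetings: list[str]) -> None:
    """Write an empty audit table with one row per scheduled meeting."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(COLUMNS)
        for name in meetings:
            writer.writerow([name] + [""] * (len(COLUMNS) - 1))

# Hypothetical meetings pulled from next week's calendar.
write_audit_template("meeting_audit.csv", ["Sprint planning", "1:1 with manager"])
```

After a month, the filled-in CSVs give you the data to analyze and share with the team, as the exercise suggests.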
Tips to Improve

Some recommendations to improve efficiency:

- Set hours of guaranteed focus time without meetings. For example, if everyone works in the same time zone and similar hours, avoid meetings in the mornings.
- Every meeting must have a prior document summarizing the topics to be discussed, who is responsible for each point, what the goal is, and the information required to work on it.
- Always apply the premise of resolving points asynchronously, and schedule a meeting only if it is really necessary. A clear example is reporting meetings: you do not need a meeting to analyze progress. First, review it asynchronously, and if you have doubts that require a meeting, then plan one.
- If a meeting has not been prepared, it is better to cancel it, even during the meeting itself. Don't worry about being honest: unprepared meetings are useless.

Reducing the number of meetings requires difficult decisions, because most people love to meet.

Emergent Tasks

These tasks are unplanned but can be very important. All teams have emergent tasks; operation and support teams usually have more, but engineering teams working on product development have them too. Some examples:

- A security bug.
- An incident with business impact.
- A request from a manager.
- Support to clarify customer questions.

Unplanned tasks change the focus of the team and often involve significant cognitive context switches. There are four main problems in managing these tasks:

- They are always urgent for the person requesting them, and it is sometimes difficult for teams to know the real priority.
- They generate chaos; often a large part of the team ends up working on them.
- When they extend over time, they demotivate people.
- A lot of time is dedicated to tasks that have no business value or high impact.
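A classic way to triage such tasks is an urgent/important decision matrix (the Eisenhower matrix). A minimal sketch, with quadrant labels of my own choosing:

```python
def eisenhower_quadrant(urgent: bool, important: bool) -> str:
    """Classify a task into one of the four Eisenhower quadrants."""
    if urgent and important:
        return "do now"
    if important:
        return "schedule"
    if urgent:
        return "delegate"
    return "drop"

# A security bug with business impact is urgent and important:
print(eisenhower_quadrant(urgent=True, important=True))   # do now
# A request with no business value and no deadline is neither:
print(eisenhower_quadrant(urgent=False, important=False))  # drop
```

The point is not the code but the agreement: the team decides once what "urgent" and "important" mean, and every emergent request is classified against those definitions instead of against the loudest requester.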
We have to define a management model for these types of tasks, because they end up having a big impact on the team, both in terms of effort and value delivery. Our teams' time is a limited resource and should be invested in tasks that add business value. There are many tasks that, if we don't execute them, no one will ever ask about.

Tips to Improve

It is important that these recommendations are aligned with the company's OKRs and agreed upon at the company level:

- Allocate a percentage of weekly effort to this type of task, except for serious incidents with business impact.
- Assign a rotating role for managing such tasks so they do not impact the focus of the rest of the team.
- Define the required information for this kind of request, such as:
  - Summary
  - Description
  - Priority
  - Business impact
  - Expected resolution date
  - Alignment with OKRs
- Define a decision matrix to determine what is important and what is not. The Eisenhower matrix is a good method for prioritizing tasks.

Conclusions

I see many companies asking their engineers to come back to the office and abandon remote work, stating that employees are less productive. Many CEOs are analyzing how new AI solutions can increase the productivity of their engineers, in many cases without having analyzed or tested the real value of these solutions in engineers' day-to-day work. I don't think these are the only solutions, just the easiest and most obvious ones to choose.

More focus time means more productivity, and that should be one of our goals as managers. Promoting a culture in our organization where focus is an important value is key to achieving success in the most optimal way possible. Productivity starts primarily with the C-levels, directors, and managers, because they are the main people responsible for promoting culture, and they are also among the largest generators of noise.
Dependency Poker is an Agile game, similar to Planning Poker, that enables teams to identify and manage dependencies in the development process. It can be used in Backlog Refinement or SAFe's PI Planning to enhance collaboration and reduce project risks.

The Cards

To play Dependency Poker, your team must create or obtain cards representing different types of dependencies. These are used in the first step to identify dependencies between software components or features. The second set of cards, the strategy cards, is used to find effective solutions for managing dependencies and mitigating risks.

1. Dependency Cards

The paper "A Taxonomy of Dependencies in Agile Software Development" identifies the dependency categories: the different types of dependencies that can affect the completion of work in a software development project. These include Requirements, Expertise, Task Allocation, Historical, Activity, Business Process, Entity, and Technical dependencies. unFix adds Planning and Environment dependencies.

Self-printed Cards from unFix

2. Dependency Breakers

unFix provides a list of 20 Dependency Breakers: Block, Throttle, Prioritize, Visualize, Flag, Measure, Automate, Standardize, Redesign, Decouple, Toggle, Contain, Coordinate, Attend, Reteam, Rotate, Swarm, Volunteer, Formalize, and Self-Service.

Dependency Breaker Cards

3. Best Practice Card Sets

The Dependency Breaker cards from unFix are a popular and widely used set of dependency cards representing the different types of dependencies that can exist between software components or features.

The Game

To start the game, the team needs to understand the different types of dependencies by explaining the Dependency Poker dependency type cards to each other. The perfect time to run Dependency Poker is during SAFe's PI Planning, Scrum's Backlog Refinement, or during workshops.

Placing All Cards on a To-Be-Refined Backlog Item

The team then places all(!)
of the Dependency Cards on the next backlog item with the highest order. The team then works together to remove the cards that are not relevant or that have already been resolved, using consensus-based decision-making to ensure that all team members share an understanding of the dependencies involved in the backlog item.

Clustering the Dependencies

In the second step of Dependency Poker, the team clusters dependencies together to understand their interconnections and develop strategies to manage them more effectively. Different team members may have different views on dependencies, and clustering helps surface potential risks and dependencies that may have been overlooked.

Finding a Solution

After clustering the dependencies, the team finds and commits to strategies to handle them. Each participant takes their set of strategy cards and throws them on the cluster of dependencies to discuss which strategies are most effective. The goal is to agree on a shared strategy and commit to implementing it. Dot voting can be used to identify the best options, and the team should document these for further analysis. This step ensures that the team has a shared understanding of the dependencies and a plan to address them, improving collaboration and reducing project risks.

Analyzing the identified dependencies and strategies can provide valuable insights into how to optimize your business agility. By reviewing the strategies the team has committed to, you can determine whether they are effective and whether additional steps could improve the management of dependencies. The growing Dependency Poker website provides a wealth of resources, including experience reports and artifacts, to help teams improve their dependency management process.
By utilizing these resources and continuously evaluating and improving their approach to handling dependencies, teams can increase their agility, reduce project risks, and deliver successful outcomes.
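The mechanics of the final step are simple enough to sketch in code. In this illustration, the card names come from the taxonomy and the unFix Dependency Breaker list mentioned above; the participants and vote counts are invented for the example.

```python
from collections import Counter

# Dependency Cards still left on the backlog item after the team
# removed the irrelevant or already-resolved ones.
remaining_dependencies = ["Expertise", "Technical", "Planning"]

# Each participant throws strategy cards (Dependency Breakers) on the
# cluster; dot voting then selects the strategy the team commits to.
votes = {
    "alice": ["Decouple", "Automate"],
    "bob": ["Decouple", "Standardize"],
    "carol": ["Decouple"],
}

def winning_strategy(votes: dict[str, list[str]]) -> str:
    """Return the strategy card that collected the most dots."""
    tally = Counter(card for cards in votes.values() for card in cards)
    return tally.most_common(1)[0][0]

print(winning_strategy(votes))  # "Decouple" wins with 3 dots
```

Documenting the losing strategies alongside the winner is worthwhile too: if "Decouple" turns out to be ineffective, the runner-up cards are the natural candidates for the next attempt.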