The Sprint Methodology in Agile Project Management
Abstract
Breaking projects down into smaller parts is essential to staying adaptive and agile, and to maintaining stakeholders' interest and the project's momentum [1]. Working in sprints allows results to be reviewed quickly and continuously, so that regular feedback keeps the project on a path that works for all parties involved. This in turn helps ensure that the project remains profitable, which can be critical for its funding [1]. This short-term, iterative approach challenges the traditional way of conducting projects, also known as Waterfall project management, which is more robust and formal: it involves very thorough preliminary planning and requires formal acceptance of the project, and of its funding, by management [2]. Sprints are part of the agile project management (APM) framework called scrum, which is typically applied in IT and software development. Such projects are subject to constant technological development and a very competitive market, which makes APM an appropriate way of managing them, as it responds to high levels of change and uncertainty [3] [4].
This article investigates how the APM method known as sprints can be applied in data science projects. It provides concrete, hands-on recommendations for project managers and scrum masters of data science projects who are about to plan and conduct sprints with their team.
Motivation
According to PRINCE2, a project is "a temporary organization that is created for the purpose of delivering one or more business products according to an agreed Business Case." [5]. To deliver such business products, a project needs to be managed with respect to many aspects, such as costs, timescales, quality, scope, risks and benefits. PRINCE2 thus defines project management as "the planning, delegating, monitoring and control of all aspects of the project, and the motivation of those involved, to achieve the project objectives within the expected performance targets for time, cost, quality, scope, benefits and risks." [5].
Projects go through a series of phases, known as the project life cycle. These phases can differ from project to project but are typically broken down by deliverables or milestones. According to the PMBOK® Guide, the phases can generically be summarised as starting the project, organizing and preparing, carrying out the project work and, finally, closing the project [3].
In the traditional predictive life cycle, the product and project deliverables are clearly defined at the beginning, and any change to the project scope has to be carefully managed. The project scope, cost and time are thus defined as early in the process as possible. This requires that all aspects of the project can be defined at an early stage and that the project is characterised by low levels of change. However, some projects involve so much uncertainty that defining aspects like scope, cost and time early on is simply too difficult. This is where agile project management and adaptive life cycles become relevant [3].
With its roots in agile software development, APM emphasises two main concepts, which together define how project teams can adapt rapidly to unpredictable and changing requirements and environments. First, risk is minimized when the team focuses on short iterations with clearly defined deliverables. Second, the feedback loop created by these short iterations facilitates direct communication with partners during the development process, avoiding unnecessary documentation and bureaucracy [4].
Agile Project Management
Scrum Process
Adaptive life cycles are characterised by very rapid iterations of approximately 2-4 weeks, each of which is fixed in time and cost. At the beginning of each iteration, the product backlog, which is derived by decomposing the overall project scope and requirements, is reviewed to determine which items can be delivered within the next iteration [3].
Within APM, a large number of different frameworks can be used. Three of the most frequently used ones, namely scrum, lean and kanban, are illustrated in figure 1. This article focuses on how data science teams can apply sprints, an adaptive method that is part of the scrum framework, to their projects. Scrum is mainly used in IT projects but is also applied in product development projects [1].
Scrum consists of three major components: roles, process and artifacts. The scrum team is cross-functional and works full-time on the project. The scrum process comprises five overall activities: the kickoff, the sprint planning meeting, the sprint, the daily scrum and the sprint review meeting, all of which are shown in figure 2 [4].
At the beginning of the project, the scrum team, the scrum master and the product owner meet to kick it off. At this kickoff meeting, the team defines the high-level backlog for the project and its major goals. Once these are defined, the same participants meet again at the beginning of each sprint for a sprint planning meeting. Here, they define the product backlog, also known as the project requirements, and determine the sprint goal, which is the formal outcome of the sprint. With these two parts in place, the participants then create the sprint backlog. A sprint planning meeting usually takes up to a day [4]. Once the sprint has been planned, it is ready to begin.
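The step from product backlog to sprint backlog can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a procedure prescribed by scrum or by the cited sources: the `BacklogItem` fields, the effort units, the `capacity` parameter and the rule of picking items in priority order until the capacity is used up are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BacklogItem:
    """One product backlog entry (all fields are assumptions for this sketch)."""
    title: str
    priority: int  # lower number = more important (assumed convention)
    effort: int    # estimated effort in arbitrary units, e.g. story points


def plan_sprint(product_backlog, capacity):
    """Pick items in priority order until the fixed sprint capacity is used.

    Items that do not fit are skipped and remain in the product backlog.
    """
    sprint_backlog = []
    remaining = capacity
    for item in sorted(product_backlog, key=lambda i: i.priority):
        if item.effort <= remaining:
            sprint_backlog.append(item)
            remaining -= item.effort
    return sprint_backlog


# Hypothetical backlog for a small data science project.
product_backlog = [
    BacklogItem("Load customer data", priority=1, effort=5),
    BacklogItem("Clean missing values", priority=2, effort=8),
    BacklogItem("Build churn dashboard", priority=3, effort=13),
    BacklogItem("Document data sources", priority=4, effort=3),
]

sprint_backlog = plan_sprint(product_backlog, capacity=16)
print([item.title for item in sprint_backlog])
# ['Load customer data', 'Clean missing values', 'Document data sources']
```

In this example the dashboard item does not fit into the remaining capacity and is left for a later sprint, which mirrors the idea that each iteration is fixed in time and cost while the scope is what gets adjusted.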
Sprint Methodology
Besides being short iterations of at most four weeks, sprints are characterised by the absence of external interference in the work of the scrum team, which also means that project requirements cannot be changed during a sprint. In general, it is recommended to hold short daily stand-up meetings of 15 minutes, known as scrum meetings, during a sprint. Here, the team discusses relevant matters such as what has been done since the last scrum meeting and what will be done before the next one.
The purpose of these daily meetings is to keep track of how the project is progressing, which helps the scrum master follow the team's work and ensure that the sprint is as efficient as possible [4]. The meetings are short and informal, leaving no room for problem solving or thorough explanations. This is why participants stand up during these meetings: it can help the team stay focused and remove obstacles to progress [2].
In data science projects, five additional meetings are recommended within each sprint. These meetings are discussed further in the Application section. After each sprint is finished, it is reviewed at the sprint review meeting, where the sprint product is demonstrated to the product owner.
Application
Data Science Life Cycle
Data science projects, which follow the Data Science Life Cycle (DSLC), differ from software development projects (SDLC) and data mining projects (CRISP-DM) in that they are more experimental and exploratory and thus need a less rigid project life cycle. In data science projects, the team learns from data and adjusts its work to the results accordingly. The amount of planning and the interdependencies between project phases in SDLC and CRISP-DM projects are thus not suited to DSLC projects [6]. The more lightweight and agile DSLC sprint consists of six phases, which are based on the scientific method. These six phases are as follows:
- Identify
- Question
- Research
- Results
- Insights
- Learn
Unlike in the SDLC, where each step leads to the next, the data science team can in the DSLC cycle through the questioning, research and results phases, as illustrated in figure 3. This inner cycle gives DSLC projects flexibility and agility. In the identification phase, the data science team starts by identifying the key roles and players in a given context. Then, in the questioning phase, the team asks relevant questions about these key players in order to explore the data in the best way possible. To help the data science team identify all the right questions, the use of a question board is recommended. The question board allows the rest of the organization to contribute its own questions. The board should be easily accessible so that anyone can add a post-it when walking by [1].
During the research phase, the data analyst explores the data based on the identified players and questions, also known as research topics. Here, the team works closely with the data analyst to make sure that everyone agrees on the desired outputs. While the data analyst explores the data within the given research topics, short reports with the results for each research topic are produced for the team; this is the results phase. These short reports can subsequently be reused if a research topic and its results provide a solid basis for further research. When all research topics have been explored, the team evaluates the results and assesses whether they provide any valuable and interesting insights. Based on these insights, the team tries in the last phase to create organizational knowledge, which can eventually provide actual value to the rest of the organization by showing what has been learned from the research [6].
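The phase order described above, including the inner cycle, can be written down as a small Python sketch. The names `Phase`, `INNER_CYCLE` and `sprint_phases` are hypothetical, and the fixed number of passes through the inner cycle is a simplifying assumption: in practice the team decides as it goes whether another round of questions and research is worthwhile.

```python
from enum import Enum


class Phase(Enum):
    """The six DSLC phases listed above."""
    IDENTIFY = "Identify"
    QUESTION = "Question"
    RESEARCH = "Research"
    RESULTS = "Results"
    INSIGHTS = "Insights"
    LEARN = "Learn"


# The three phases the team may cycle through before moving on.
INNER_CYCLE = (Phase.QUESTION, Phase.RESEARCH, Phase.RESULTS)


def sprint_phases(inner_rounds):
    """Return the phase order of one DSLC sprint.

    `inner_rounds` is the assumed number of passes through the inner cycle.
    """
    if inner_rounds < 1:
        raise ValueError("at least one pass through the inner cycle is needed")
    order = [Phase.IDENTIFY]
    for _ in range(inner_rounds):
        order.extend(INNER_CYCLE)
    order.extend([Phase.INSIGHTS, Phase.LEARN])
    return order


print(" -> ".join(phase.value for phase in sprint_phases(2)))
# Identify -> Question -> Research -> Results -> Question -> Research -> Results -> Insights -> Learn
```

With a single pass the sequence reduces to the six phases in their listed order; each additional pass repeats only the three inner phases.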
Data Science Sprint Meetings
To run scrum efficiently, sprints must be managed carefully. One way to ensure that a sprint is on track is to hold regular meetings. Data science teams are advised to hold five different types of meetings during a sprint, and the sprint itself should last no longer than two weeks [1]. These five meetings are as follows:
- Research Planning
- Question Breakdown
- Visualization Design
- Storytelling Session
- Team Improvement
All five meetings are held in every sprint to make sure that the data science team provides interesting insights at the end of the sprint, as illustrated in figure 4. The five types of meetings have different purposes, and together they ensure that each of the six phases of a DSLC sprint is covered properly. It is essential that all five sprint meetings are time-boxed. When a meeting is time-boxed, the team agrees, prior to the meeting, on the amount of time it may take. Whatever is agreed upon during the meeting then has to last for the rest of the sprint, since time-boxed meetings cannot be rescheduled or followed up on. This helps ensure that a sprint is rapid and agile, focusing on exploring and experimenting rather than on planning and organizing [1].
Research Planning
In the first sprint meeting, the Research Planning meeting, the data science team evaluates all the questions that have been identified during the identification phase and collected on the question board. During this meeting, which typically lasts two hours, the team selects the most interesting questions, and the data analyst and research lead subsequently prepare the sprint agenda. The sprint agenda is thus a compromise between the data analyst and the research lead: it should contain the minimum viable set of research topics, so that the data is explored without unnecessary time being spent on it.
Question Breakdown
During the Question Breakdown meeting, the whole data science team meets to review any new questions from the question board and to try to come up with further ones. This type of meeting also serves to identify potential clusters among the questions and to determine whether any questions can be broken down into more manageable ones. It is recommended to hold two one-hour sessions of this kind during a sprint. These meetings ensure a constant flow of research topics and keep the whole team aligned. They are also a good setting for prioritizing all questions and preparing the following sprint.
The Research Planning and Question Breakdown meetings are thus two ways of managing the first two phases of the DSLC: the identification and questioning phases.
Visualization Design
During the third meeting, the Visualization Design meeting, the data analyst and research lead together develop an interesting visualization based on the results of the data analyst’s experimentation with the data. This visualization will be used in the Storytelling Session, where the data analyst will tell the “story” behind the sprint results. The meeting should not last more than an hour, and the visualizations can remain rough drafts.
Storytelling Session
Based on the visualizations produced by the data analyst and research lead, the data science team meets for a Storytelling Session to present the story of what it has learned during the sprint. The meeting should last an hour, during which data visualizations are shown, questions from the question board are discussed and the stories behind the questions are told.
Team Improvement
At the end of the sprint, the data science team meets for a two-hour improvement meeting in which the team's progress is discussed and evaluated. Here, potential changes can be agreed upon for the next sprint, which helps ensure that the team keeps progressing and learns quickly from its mistakes and successes.
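Taken together, the time boxes given for the five meetings add up to a fixed meeting budget per sprint. The short Python sketch below does the arithmetic. The meeting lengths come from the descriptions above, with the Visualization Design meeting counted at its one-hour upper bound; the figure of ten working days for a two-week sprint is an assumption used only to estimate the time spent in daily scrum meetings.

```python
# (number of sessions per sprint, minutes per session), as given in the text.
SPRINT_MEETINGS = {
    "Research Planning": (1, 120),
    "Question Breakdown": (2, 60),
    "Visualization Design": (1, 60),  # upper bound: no more than an hour
    "Storytelling Session": (1, 60),
    "Team Improvement": (1, 120),
}

DAILY_SCRUM_MINUTES = 15
WORKING_DAYS = 10  # assumption: two-week sprint, five working days per week


def meeting_minutes(meetings):
    """Sum sessions times length over all meeting types."""
    return sum(count * minutes for count, minutes in meetings.values())


sprint_meeting_hours = meeting_minutes(SPRINT_MEETINGS) / 60
daily_scrum_hours = DAILY_SCRUM_MINUTES * WORKING_DAYS / 60

print(f"Five sprint meetings: {sprint_meeting_hours:.1f} h")  # 8.0 h
print(f"Daily scrum meetings: {daily_scrum_hours:.1f} h")     # 2.5 h
print(f"Total per sprint: {sprint_meeting_hours + daily_scrum_hours:.1f} h")  # 10.5 h
```

Under these assumptions, each team member spends roughly one working day per sprint in the five data science meetings, plus about two and a half hours in daily scrum meetings.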
Burndown Charts
Explain the relevance and use of Burndown charts (typical for sprints)
Limitations
- Is the risk really always minimised?
- Are there other risks combined with APM?
- What are the limitations/pitfalls of running sprints?
References:
Avoiding Pitfalls in Delivering in Data Science Sprints (Doug, 2016)
Risks Characteristic of Agile Project Management Methodologies and Responses to Them (Walczak, 2013)
Glossary
APM - Agile Project Management
DSLC - Data Science Life Cycle
SDLC - Software Development Life Cycle
CRISP-DM - Cross Industry Standard Process for Data Mining
Annotated Bibliography
Rose D. (2016) Working in Sprints. In: Data Science. Apress, Berkeley, CA
Project Management Institute (2013) A Guide to the Project Management Body of Knowledge (PMBOK® Guide). In: Project Management Institute
H. Frank Cervone (2011) Understanding agile project management methods using Scrum.
References
- [1] Rose D. (2016) Working in Sprints. In: Data Science. Apress, Berkeley, CA
- [2] Maylor H. (2010) Project Management. In: Financial Times Prentice Hall
- [3] Project Management Institute (2013) A Guide to the Project Management Body of Knowledge (PMBOK® Guide). In: Project Management Institute
- [4] Cervone H. F. (2011) Understanding agile project management methods using Scrum. In: OCLC Systems & Services: International digital library perspectives
- [5] Office of Government Commerce (2009) Managing Successful Projects with PRINCE2™. In: TSO
- [6] Rose D. (2016) Using a Data Science Life Cycle. In: Data Science. Apress, Berkeley, CA