The Sprint Methodology in Agile Project Management

From apppm
(Difference between revisions)
(Data Science Life Cycle)
 
(162 intermediate revisions by one user not shown)
Line 1: Line 1:
 +
''Developed by Sarah Bourdiaux Terp''
 +
 +
 
== Abstract ==
 
  
Breaking down projects into smaller parts is paramount to being adaptive and agile as well as for maintaining the interest and momentum towards stakeholders <ref name=" Sprints " />. Working in sprints facilitates a quick and continuous review of results allowing regular feedback and thereby keeping on the right path towards a successful project for all parties involved. This eventually also helps ensuring that the project continues being profitable, which can be critical towards the funding of the project <ref name=" Sprints " />. This short-term iterativeness thus challenges the traditional way of conducting projects, also known as Waterfall project management, which is characterised by more robustness and formalities involving very thorough preliminary planning as well as the need for formal acceptance of the project from management and thereto funding <ref name=" Harvey " />. Sprints are part of the agile project management framework called scrum, which is typically applied within IT and software development processes. IT and software development projects are affected by the constant technological development and a very competitive market, which makes agile project management an evident way of managing these projects, as it responds to high levels of change and uncertainty <ref name=" PMBOK " /> <ref name=" Scrum " />.  
+
Working in sprints facilitates a quick and continuous review of results, allowing regular feedback and thereby keeping the project on the right path towards success for all parties involved. This eventually also helps ensure that the project remains profitable and maintains interest and momentum among stakeholders, which can be critical to the funding of the project <ref name=" Sprints " />. This rapid iteration thus challenges the traditional way of conducting projects, also known as waterfall project management. Traditional project management is characterized by more robustness and formality, involving very thorough preliminary planning as well as protracted requirements definition. The long-lasting requirements definition often results in outdated requirements before project development has even begun. Agile project management (APM) was thus developed to meet the needs of a new type of project where the development phase is more crucial than the planning phase, namely within information systems and technology projects <ref name=" Scrum " />.
  
This article investigates relevant aspects of the agile project management methodology, known as sprints, within data science projects. It provides concrete and hands-on recommendations to project managers of data science projects who are about to plan and conduct sprints with their team.
+
Sprints are part of the APM framework scrum, which is typically applied within IT projects, software development projects and projects of a more exploratory nature like data science projects. These types of projects are affected by constant technological development, which makes sprints an appropriate way of managing them, as this methodology helps teams respond to high levels of change and uncertainty <ref name=" PMBOK " /> <ref name=" PRINCE " />.
  
== Agile Project Management ==
 
  
===Motivation===
+
This article investigates relevant aspects of the sprint methodology within data science projects. It provides concrete and hands-on recommendations to project managers or scrum masters of data science projects who are about to plan and conduct sprints with their team.
  
According to PRINCE 2, a project is "a temporary organization that is created for the purpose of delivering one or more business products according to an agreed Business Case."<ref name=" PRINCE " />. To deliver such business products, a project needs to be managed with respect to many aspects such as costs, timescales, quality, scope, risks and benefits. PRINCE 2 thus defines project management as "the planning, delegating, monitoring and control of all aspects of the project, and the motivation of those involved, to achieve the project objectives within the expected performance targets for time, cost, quality, scope, benefits and risks."<ref name=" PRINCE " />.
+
==Motivation==
  
Projects go through a series of phases, also known as a project's life cycle. These phases can differ from project to project but are typically broken down by deliverables or milestones. Generically, the phases can, according to the PMBOK® Guide, be summarized as starting the project, organizing and preparing, carrying out the project work and eventually closing the project <ref name=" PMBOK " />.
+
According to PRINCE 2, a project is "a temporary organization that is created for the purpose of delivering one or more business products according to an agreed Business Case."<ref name=" PRINCE " />. To deliver such business products, a project needs to be managed. PRINCE 2 defines project management as "the planning, delegating, monitoring and control of all aspects of the project, and the motivation of those involved, to achieve the project objectives within the expected performance targets for time, cost, quality, scope, benefits and risks." <ref name=" PRINCE " />.
  
In the traditional predictive life cycle, the products and projects' deliverables are clearly defined at the beginning with little room for changes in the scope of the project without needing careful managing. The project scope, costs and time are thus defined as early in the process as possible. This requires that a project is definable within all its aspects at an early stage and is characterised by low levels of change. However, some projects are characterised by such an amount of uncertainty that the early defining of aspects like project scope, cost and time simply is too difficult. This is where agile project management (APM) and the adaptive life cycles become relevant <ref name=" PMBOK " />.
+
Projects go through a series of phases, also known as a project's life cycle. These phases can differ from project to project but are typically broken down by deliverables or milestones. In general, the phases can, according to the PMBOK® Guide, be summarized as starting the project, organizing and preparing, carrying out the project work and eventually closing the project <ref name=" PMBOK " />.
+
In the traditional predictive life cycle, the product and project deliverables are clearly defined at the beginning, with little room for changes in the scope of the project without careful management. The project scope, costs and time are thus defined as early in the process as possible. This requires that a project is definable within all these aspects at an early stage and is characterized by low levels of change and uncertainty. However, some projects are characterized by so much uncertainty that defining such aspects early is simply too difficult, which makes traditional project management and predictive life cycles inadequate. This is where APM and adaptive life cycles like sprints become relevant and challenge the traditional project life cycle, such as the one defined by the PMBOK® Guide <ref name=" PMBOK " />. New demands on projects thus require new methodologies and frameworks.
Taking offset in agile software development, agile project management takes emphasis in two main concepts. These two concepts help defining how project teams can adapt rapidly to these unpredictable and changing requirements and environment. First, risk is minimized when the team focuses on short iterations with clearly defined deliverables. Secondly, the feedback loop created by these short iterations facilitates the direct communication with partners during the development process avoiding unnecessary documentation and bureaucracy <ref name=" Scrum " />.
+
  
===Scrum and Sprint Management ===
+
Originating in agile software development, APM emphasizes two main concepts that help define how project teams can adapt rapidly to these unpredictable and changing requirements. First, risk is minimized when the team focuses on short iterations with clearly defined deliverables. Secondly, the feedback loop created by these short iterations facilitates direct communication with partners during the development process, which helps avoid unnecessary documentation and bureaucracy <ref name=" Scrum " />.
  
Adaptive life cycles are characterised by very rapid iterations of approximately 2-4 weeks, each of which is fixed in time and cost. At the beginning of each of these iterations, the product backlog list, which is defined by the decomposition of the overall project scope, is reviewed in order to determine which of the items can be delivered within the next iteration <ref name=" PMBOK " />.
+
== Agile Project Management ==
  
[[File:APM_overview2.png|thumb|left|451x418px|Figure 1: Agile Project Management Frameworks]]
+
===Scrum Roles and Process ===
 +
[[File:APM_overview2.png|thumb|right|451x418px|Figure 1: Agile Project Management Frameworks]]
  
Within agile project management, there exists a large number of different frameworks that can be used. Three of the most frequently used ones, namely the scrum, lean and kanban frameworks, are illustrated on figure 1. This article focuses on how data science teams can apply the adaptive method called sprints to their projects, which is part of the APM framework called scrum.
+
Adaptive life cycles like sprints are characterized by very rapid iterations of approximately 2-4 weeks. Each sprint is fixed in time and cost at its beginning. At this point, the product backlog list, also known as the project requirements, is also reviewed in order to determine which of the requirements can be delivered within the next iteration <ref name=" PMBOK " />.
The scrum is mainly used within IT projects but is also applied within product development projects <ref name=" Sprints " />.
+
  
The scrum consists of three major components: the roles, process and artifacts. The team working on the scrum is cross-functional and working full-time on the project. The scrum process counts five overall activities: the kickoff, the sprint planning meeting, the sprint, the daily Scrum, and the sprint review meeting, which all can be seen on figure 2 <ref name=" Scrum " />.  
+
Within APM, a large number of different but useful frameworks exist. Three of the most frequently used ones, namely the scrum, lean and kanban frameworks, are illustrated in figure 1.
 +
Sprints are part of the scrum framework, which is mainly used within IT projects <ref name=" Sprints " />. The scrum consists of three major components: the roles, the process and the artifacts.  
  
 
[[File:scrums.png|thumb|right|451x418px|Figure 2: Scrum Activities]]
 
  
At the beginning of the project, the scrum team, the scrum master and the product owner all meet together to kick it off. At this kick-off meeting, the high-level backlog for the project and its major goals are defined by the team. When these are defined, the same participants all meet again at the beginning of each sprint for a sprint planning meeting. Here, they define the product backlog, also known as the project requirements, and they determine the sprint goal, which is the formal outcome of the sprint. Having defined these two main parts, the participants consequently create the sprint backlog. These sprint planning meetings usually take up to a day <ref name=" Scrum " />.  
+
Starting with the roles, scrum involves a product owner, a scrum master and a cross-functional scrum team. The team works full-time on the project in order to get through the product backlog list in due time. The second major scrum component, namely the process, comprises five overall activities: the kick-off, the sprint planning meeting, the sprint, the daily scrum, and the sprint review meeting, which are all illustrated in figure 2 <ref name=" Scrum " />. The third major scrum component, the artifacts, includes the product backlog, the sprint backlog and the burn-down charts, which are elaborated further in the ''Scrum Artifacts'' section.
  
Having planned the sprint, the latter is now ready to begin. What characterises sprints, besides being short iterations of maximum four weeks, is, that no external interference should influence the work of the scrum team, which also means that project requirements cannot be changed during a sprint. In general, it is recommended to hold short, daily stand-up meetings known as scrum meetings of 15 minutes during a sprint, where relevant matters such as what has been done since last scrum meeting and what will be done within next scrum meeting can be discussed. The purpose of these daily meetings is to hold track of how the project is progressing, which can help the scrum master keeping track of the team and ensure as efficient a sprint as possible <ref name=" Scrum " />. The meetings are short and informal leaving no room for problem solving or thorough explanations, hence the importance of standing up during these meetings, which can help the team being focused and removing obstacles to progress <ref name=" Harvey " />.  
+
At the beginning of a project, the product owner, the scrum master and the scrum team all meet together to kick it off. This meeting occurs only once during a project. At this kick-off meeting, the high-level backlog for the project, which is the overall project scope and requirements, as well as its major goals are defined. Subsequently, the same participants all meet again at the beginning of each sprint for a sprint planning meeting. Here, they define the product backlog list i.e. the decomposition of the high-level backlog. Furthermore, they determine the sprint goal, which is the formal outcome of the sprint. Unlike the kick-off meeting, the sprint planning meeting is held at the beginning of each sprint within a project.
Within data science projects, five additional meetings are recommended within each sprint. These meetings will be discussed further in the Application section.
+
After each sprint is finished, it is reviewed during the sprint review meeting. Here, the sprint product is demonstrated to the product owner.
+
  
==Application==
+
After defining these two main parts, the participants consequently outline the sprint backlog. These sprint planning meetings will usually take up to a day <ref name=" Scrum " />. Having planned the sprint, its backlog and goals, the team is now ready to explore the sprint’s research topics.
  
===Data Science Life Cycle===
+
===Scrum Artifacts===
  
 +
[[File:burn-down.png|thumb|right|500x500px|Figure 3: Examples of burn-down charts made in Excel (Inspired by <ref name=" fig1 " /> <ref name=" fig2 " /> <ref name=" fig3 " />)]]
  
[[File:data_science.png|thumb|left|451x418px|Figure 3: Data Science Life Cycle (Inspired by D. Rose, 2016 <ref name=" Sprints " />)]]
+
As mentioned, the scrum artifacts include the product and the sprint backlog as well as the burn-down charts. The product and sprint backlogs are defined during the sprint planning meetings. The product backlog is defined with the product owner whereas the sprint backlog is defined only by the scrum team, since this type of backlog is more specific to the individual sprint <ref name=" Scrum " />.
  
Data Science projects (DSLC) differ from Software (SDLC) and Data Mining projects (CRISP-DM) as they are of a more experimental and exploratory nature and thus need a less rigid project life cycle. In Data Science projects, the team learns from the data and adjusts its work to the results accordingly. The amount of planning and interdependencies between the project phases in SDLC and CRISP-DM projects are thus not suited for DSLC projects <ref name=" DSLC " />.  
+
Unlike traditional project management, scrum deliberately makes use of burn-down charts. Burn-down charts are simple two-dimensional charts showing the progress of a given process. The purpose of using burn-down charts is to provide relevant information about this progress in an easy-to-comprehend manner. This is especially relevant during scrum and sprints, where time is a limited resource and needs to be managed carefully <ref name=" Scrum " />. Three types of burn-down charts are commonly used. The sprint burn-down chart documents the progress of the sprint.
The more lightweight and flexible DSLC framework consists of six phases, taking offset in the scientific method. These six phases are, as illustrated on figure 3, the identifying, the questioning, the researching, the results, the insights and the learning. Unlike the SDLC where each step leads to the next, the data science team can cycle through the questioning, researching and results phases. This inner cycle provides DSLC projects flexibility.  
+
The release burn-down chart documents the progress of the release and the product burn-down chart documents the overall project progress <ref name=" Scrum " />. An illustration of what these three types of burn-down charts can look like can be seen in figure 3.
In the DSLC phases, the Data Science team starts by identifying the key roles and players in a given context. Then, the team starts asking relevant questions about these key players within the same context in order to explore the data in the best way possible. During the research phase, the data analyst explores the data based on the identified players and questions. Here, the team will work closely with the data analyst to make sure that the work stays aligned with the desired research topic. While the data analyst explores the data within the given research topics, a short report with the results from each research topic is produced for the team. These short reports can subsequently be used if a research topic and its results provide a solid basis for further research. When all research topics have been explored, the team evaluates the results and assesses whether they provide any valuable and interesting insights. Based on these insights, the team will try to create organizational knowledge, which eventually can provide actual value to the rest of the organization by showing what has been learned from the research <ref name=" DSLC " />.
+
  
===Data Science Sprint Meetings===
+
As the examples in figure 3 illustrate, time is typically represented on the x-axis and the remaining work on the y-axis. The sprint burn-down chart is on a very operational level, showing the day-to-day progress by depicting the total backlog hours remaining in the sprint per day. These total backlog hours are estimated based on the work left in the sprint. The sprint burn-down chart would thus decrease to zero hours remaining by the end of the sprint.
  
In order to run scrums efficiently, it is necessary to manage sprints carefully. A way to ensure that the sprint is on track is to hold regular meetings. For data science teams it is recommended to run five different types of meetings during a sprint. These five meetings are as follows:
+
The release burn-down chart works in a similar way, but it represents the time remaining until the release is done. Finally, the product burn-down chart is used to indicate the overall project progress on a higher level <ref name=" Scrum " />.
  
• Research Planning
+
These burn-down charts can thus help the scrum team, and especially the scrum master, keep an easy overview of how the project is progressing and, on a more specific level, get a day-to-day overview of how each sprint is progressing.
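The data behind a sprint burn-down chart can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 10-day sprint and the 80 estimated backlog hours are hypothetical and do not come from the cited sources. The sketch computes the backlog hours remaining at the end of each day together with an ideal straight line that reaches zero at the end of the sprint.

```python
def sprint_burn_down(total_hours, hours_done_per_day):
    """Return the backlog hours remaining at the end of each sprint day."""
    remaining = []
    left = total_hours
    for done in hours_done_per_day:
        # Remaining work never drops below zero.
        left = max(left - done, 0)
        remaining.append(left)
    return remaining


def ideal_burn_down(total_hours, sprint_days):
    """Return a straight reference line from total_hours down to zero."""
    return [
        total_hours * (sprint_days - day) / sprint_days
        for day in range(1, sprint_days + 1)
    ]


# Hypothetical sprint: 10 working days, 80 estimated backlog hours.
hours_done = [6, 8, 10, 7, 9, 8, 8, 9, 8, 7]
actual = sprint_burn_down(80, hours_done)
ideal = ideal_burn_down(80, len(hours_done))

for day, (a, i) in enumerate(zip(actual, ideal), start=1):
    status = "behind" if a > i else "on track"
    print(f"Day {day:2d}: {a:3d} h remaining (ideal {i:4.1f} h) -> {status}")
```

Plotting `actual` and `ideal` against the day number gives the usual picture: whenever the actual line lies above the ideal line, the sprint is behind its plan.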
  
• Question Breakdown
+
===Sprint Methodology===
  
• Visualization Design
+
What characterizes sprints, besides being short iterations of at most four weeks, is that no external interference should influence the work of the scrum team. This also means that project requirements cannot be changed during a sprint. In general, it is recommended to hold short daily stand-up meetings of 15 minutes, known as scrum meetings, during a sprint. Here, relevant matters such as what has been done since the last scrum meeting and what will be done before the next scrum meeting can be discussed.
  
• Storytelling Session
+
The purpose of these daily meetings is to keep track of how the sprint, and thereby the project, is progressing. This can help the scrum master keep track of the team and ensure as efficient a sprint as possible <ref name=" Scrum " />. The meetings are short and informal, leaving no room for problem solving or thorough explanations, hence the importance of standing up during these meetings. Standing up can help the team stay focused and remove any obstacles to progress <ref name=" Harvey " />.
  
• Team Improvement
+
Within data science projects, five additional meetings are recommended within each sprint. These meetings will be discussed further in the Application section. After each sprint is finished, it is reviewed during the sprint review meeting. Here, the sprint product is demonstrated to the product owner.
  
The meetings are to be held in the same order for every sprint, as is illustrated on figure 4.
+
==Application==
These five types of meetings have different purposes and they all together ensure that every phase of a sprint is covered properly.
+
  
Starting with the first sprint meeting, namely the Research Planning meeting,
+
===Data Science Life Cycle Sprints===
  
 +
Depending on the project type, a number of different adaptive life cycles exist. Data science projects differ from software development and data mining projects as they are of a more experimental and exploratory nature and thus need a less rigid project life cycle. In data science projects, the team learns from data and adjusts its work to the results accordingly. The amount of planning and the interdependencies between the project phases in the software development life cycle (SDLC) and data mining life cycle (CRISP-DM) are thus not suited for data science projects. The data science life cycle (DSLC) sprint is better suited to data science projects due to its more lightweight and agile nature. It consists of six phases, based on the scientific method <ref name=" DSLC " />. These six phases are:
  
Explain each meeting type and purpose of them
+
* Identify
 +
* Question
 +
* Research
 +
* Results
 +
* Insights
 +
* Learn
  
Explain the relevance and use of Burndown charts (typical for sprints)
+
[[File:data_science.png|thumb|right|451x418px|Figure 4: Data Science Life Cycle (Inspired by D. Rose (2016) <ref name=" DSLC " />)]]
  
[[File:sprint_meetings.png|thumb|left|451x418px|Figure 4: Data Science Sprint Meetings]]
+
Unlike in the SDLC, where each phase leads to the next, a team can “cycle” through the questioning, researching and results phases in each DSLC sprint. This mechanism is illustrated by arrows in figure 4. This inner cycle provides data science projects with flexibility and agility. In the identification phase, the scrum team starts by identifying the key roles and players in a given context. Then, in the questioning phase of the DSLC sprint, the team starts asking relevant questions about these key players in order to explore the data in the best way possible. To help the team identify all the right questions, the use of a question board is recommended. The question board allows the rest of the organization to contribute its questions. The board should be easily accessible so that anyone can add a post-it with a question when walking by <ref name=" Sprints " />.
  
 +
During the research phase, the team’s data analysts explore the data based on the identified players and questions, also known as research topics. Here, the rest of the team will work closely with the data analysts to make sure that there is compliance with the desired outputs. When the data analysts have finished exploring data within the given research topics, a short report with the results from each research topic is produced during the results phase. These short reports can subsequently be used if a research topic and its results provide a solid basis for further research. When all research topics have been explored, the team evaluates the results and assesses if they provide any valuable and interesting insights. Based on these insights, the team will in the last phase try to create organizational knowledge, which eventually can provide actual value to the rest of the organization showing what has been learned from the research in the sprint <ref name=" DSLC " />.
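The order of the six phases and the inner cycle can be sketched in Python as follows. This is a minimal illustration, not part of the DSLC as described by the sources: the function name and the `inner_cycles` parameter are hypothetical and only serve to show that the questioning, researching and results phases may be repeated before the team moves on to insights and learning.

```python
# The six DSLC phases in the order listed above.
PHASES = ["Identify", "Question", "Research", "Results", "Insights", "Learn"]

# The three phases a team can cycle through within one sprint.
INNER_CYCLE = ["Question", "Research", "Results"]


def dslc_sprint_path(inner_cycles=1):
    """Return the sequence of phases visited in one DSLC sprint.

    inner_cycles is a hypothetical parameter: how many times the team
    passes through the Question -> Research -> Results loop.
    """
    if inner_cycles < 1:
        raise ValueError("a sprint passes through the inner cycle at least once")
    return ["Identify"] + INNER_CYCLE * inner_cycles + ["Insights", "Learn"]


# A sprint in which the first results raise a second round of questions.
print(" -> ".join(dslc_sprint_path(inner_cycles=2)))
```

With a single pass, the path is simply the six phases in order; every additional pass inserts another round of questioning, researching and results.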
  
 +
===Data Science Life Cycle Sprint Meetings===
  
 +
In order to run scrums efficiently, it is necessary to manage sprints carefully. A way to ensure that the sprint is on track is to hold regular meetings. For scrum teams in data science projects, it is recommended to run five different types of meetings during a sprint, which is recommended to last no longer than two weeks for data science projects <ref name=" Sprints " />. These five meetings are:
  
 +
* Research Planning
 +
* Question Breakdown
 +
* Visualization Design
 +
* Storytelling Session
 +
* Team Improvement
  
 +
The meetings are to be held together for every sprint to make sure that the scrum team provides interesting insights at the end of the sprint. These five types of meetings have different purposes and they all together ensure that each of the six phases of a DSLC sprint is covered properly, as is illustrated in figure 5. It is essential that these five sprint meetings are all time-boxed.
  
 +
A meeting is time-boxed when the team agrees upon the duration of the meeting prior to initiating it. What is agreed during the meeting will thus have to last for the rest of the sprint, since time-boxed meetings cannot be rescheduled or followed up. This helps ensure that a sprint is rapid and agile, focusing on exploring and experimenting rather than planning and organizing <ref name=" Sprints " />.
  
 +
[[File:sprint_meetings.png|thumb|right|451x418px|Figure 5: Data Science Sprint Meetings (Inspired by D. Rose (2016) <ref name=" Sprints " />)]]
  
 +
====Research Planning====
 +
Starting with the first sprint meeting, namely the Research Planning meeting, the scrum team evaluates all the questions that have been identified during the identification phase and on the question board. During this meeting, which typically lasts two hours, the team selects the most interesting questions, and the data analysts and scrum master subsequently prepare the sprint agenda. The sprint agenda is thus a compromise between the data analysts and the scrum master that ensures minimum viable research topics for exploring the data while avoiding spending unnecessary time on them <ref name=" Sprints " />.
  
 +
====Question Breakdown====
 +
During the Question Breakdown meeting, the scrum team gathers to review any new questions from the question board and tries to come up with some new ones. This type of meeting also serves to identify potential clusters in the questions and to determine whether any questions can be broken down into more manageable questions. It is recommended to have at least two one-hour sessions of such meetings during a sprint. These meetings ensure a constant flow of research topics and that the whole team is aligned. It is also a good setting for prioritizing all questions and preparing the following sprint <ref name=" Sprints " />.
  
 +
Together with the Research Planning meetings, the Question Breakdown meetings are thus ways of managing the first two phases of the DSLC sprint: the identifying and questioning phase.
  
 +
====Visualization Design====
 +
During the third meeting, the Visualization Design meeting, the data analysts and scrum master develop interesting visualizations together based on the results from the data analysts’ experimentations with data. The meeting should not last more than an hour and the visualizations can remain as a rough draft <ref name=" Sprints " />. These visualizations will subsequently be used in the Storytelling Session.
  
 +
The Visualization Design meeting thus helps manage the sprint results from the DSLC results phase and create valuable insights during the DSLC insights phase.
  
 +
====Storytelling Session====
 +
Based on the visualizations done by the data analysts and scrum master, the scrum team meets at a Storytelling Session to present the story of what they have learned during the sprint. The meeting should last an hour, in which data visualizations are shown, questions from the question board are discussed and the stories behind the questions are told <ref name=" Sprints " />.
  
 +
The Storytelling Session is thus a meeting during the DSLC learning phase that helps transmit the knowledge gained through the sprint.
  
 +
====Team Improvement====
 +
At the end of the sprint, the scrum team meets for a two-hour improvement meeting where the team's progress is discussed and evaluated. Here, potential changes can be agreed upon before the next sprint, which helps ensure that the team is constantly progressing and learning fast from its mistakes and successes <ref name=" Sprints " />.
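Since all five meetings are time-boxed, the meeting time per sprint can be budgeted in advance. The following Python sketch adds up the durations given above. It rests on two assumptions that are not stated in the sources: exactly two Question Breakdown sessions are held (the recommended minimum), and the Visualization Design meeting uses its full one-hour upper limit. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Meeting:
    name: str
    hours: float       # agreed time-box per session
    sessions: int = 1  # sessions held per sprint


# Durations as described in the sections above.
# Assumptions: two Question Breakdown sessions (the stated minimum) and
# a full hour for Visualization Design (the stated upper limit).
MEETINGS = [
    Meeting("Research Planning", 2.0),
    Meeting("Question Breakdown", 1.0, sessions=2),
    Meeting("Visualization Design", 1.0),
    Meeting("Storytelling Session", 1.0),
    Meeting("Team Improvement", 2.0),
]


def total_meeting_hours(meetings):
    """Sum the time-boxed hours of all meeting sessions in one sprint."""
    return sum(m.hours * m.sessions for m in meetings)


print(f"Time-boxed meeting hours per sprint: {total_meeting_hours(MEETINGS)}")
```

Under these assumptions, the five meeting types take up about one working day of a sprint, leaving the remaining time for exploring the data.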
  
 +
== Limitations ==
  
 +
Running a project in sprints is an ideal methodology when projects are of an exploratory nature or simply require fast development and deployment <ref name=" Sprints " />. This notion of failing fast allows project teams to develop ideas and test them quickly without having to spend too much time on planning or waiting unnecessarily long for high-level approval. Therefore, the overall focus is on developing and delivering minimum viable products and working further on them if they are a success, that is, if they live up to the sprint goals and especially the project goals <ref name=" Scrum " /> <ref name=" Pitfall " />.
  
 +
However, even though APM methodologies are said to minimize risks, there are some limitations to be aware of, both as a team and as a scrum master of DSLC sprint projects <ref name=" Scrum " />. Based on the PMBOK® Guide and relevant literature, two main limitations will be discussed.
  
 +
'''No interfering'''
  
 +
Since sprints, and especially DSLC sprints <ref name=" DSLC " />, are very limited in time, it is crucial that the team works on the sprint backlog exclusively <ref name=" Scrum " />. This can be a challenge, especially in larger organizations where the risk of getting interrupted is very high due to the number of employees. Scrum teams are cross-functional and thus not necessarily teams that are used to working together. Therefore, other employees in the departments may not be aware of these sprints and may reach out to their colleagues for help as they are used to. It is therefore important that the scrum master emphasizes the importance and priority of the sprint backlog and helps communicate this to team members' direct managers if necessary.
  
 +
'''No objectives'''
  
 +
As DSLC sprints are mainly recommended when working on exploratory projects like data science projects, the team needs to adjust their usual working habits from working towards specific objectives to having to develop research topics based on relevant questions during the Research Planning meeting <ref name=" Pitfall " />. These research topics will then serve as requirements in the sprint backlog but as data science projects are exploratory, there is no way of defining the objectives prior to having explored the data.
  
 +
This way of working thus challenges the usual perception of work where there are clear objectives that have to be met before the project can come to an end <ref name=" PMBOK " />. As the PMBOK® Guide describes, a project end is reached when "the project's objectives have been achieved or when the project is terminated because its objectives will not or cannot be met..." <ref name=" PMBOK " />.
 +
This lack of specific objectives makes DSLC sprints unsuitable for other types of projects that have clear deliverables. Examples are construction projects, which have a very clear start and end as well as clearly defined deliverables and success criteria. These types of projects also require a great deal of planning and funding, which runs contrary to the nature of DSLC projects <ref name=" DSLC " />.
  
 +
DSLC sprints are not explicitly part of well-renowned standards like the PMBOK® Guide or PRINCE 2. However, they are a subcategory of the adaptive and agile life cycles described in the PMBOK® Guide <ref name=" PMBOK " />. These project life cycles are all based on the APM methodology, and DSLC sprints are thus an extension of the adaptive life cycles.
  
 +
== Glossary ==
  
 +
'''APM''' - Agile Project Management
  
 +
'''DSLC''' - Data Science Life Cycle
  
 +
'''SDLC''' - Software Development Life Cycle
  
 +
'''CRISP-DM''' - Cross Industry Standard Process for Data Mining
  
 +
== Annotated Bibliography ==
  
 +
'''Rose D. (2016) Data Science: Create Teams That Ask the Right Questions and Deliver Real Value In: Data Science. Apress, Berkeley, CA:''' This book explains in detail how to run data science projects successfully. This wiki article has primarily drawn on the chapters on how to deliver successful data science projects, but the book also covers many other relevant aspects. The first section defines what data science is, the second digs into how to build a well-functioning data science team, and the last focuses on how to ask the right questions during the Question Breakdown meetings.
  
 +
'''Project Management Institute (2013) A Guide to the Project Management Body of Knowledge (PMBOK® Guide). In: Project Management Institute:''' This guide provides widely accepted guidelines, rules and characteristics for project, program and portfolio management. In this wiki article, the PMBOK® Guide has primarily been used for its section on project life cycles. However, it covers many more relevant subjects within project, program and portfolio management. For example, it also provides extensive guidelines on Project Scope, Time, Cost and Quality Management as well as Project Risk Management.
  
 
+
'''H. Frank Cervone (2011) Understanding agile project management methods using Scrum:''' This article provides a very clear understanding of what scrum is and how it relates to agile project management. Some details are addressed in this wiki article, but the interested reader will find more in the article itself. It traces the origin of agile project management to the agile software development movement and provides tangible tools for managing scrum projects.
 
== Limitations ==
+
 
+
 
+
- Is the risk really always minimised?
+
 
+
- Are there other risks combined with APM?
+
 
+
- what are the limitations/pitfalls to running sprints?
+
 
+
 
+
References:
+
 
+
Avoiding Pitfalls in Delivering in Data Science Sprints (Doug, 2016)
+
 
+
RISKS CHARACTERISTIC OF AGILE PROJECT MANAGEMENT METHODOLOGIES AND RESPONSES TO THEM (WALCZAK, 2013)
+
 
+
== Annotated Bibliography ==
+
 
+
  
 
== References ==
 
  
+
<references>
  
<references>
+
<ref name="Sprints"> Rose D. (2016) Data Science: Create Teams That Ask the Right Questions and Deliver Real Value - Chp. 13: Working in Sprints. In: Data Science. Apress, Berkeley, CA </ref>
<ref name="Sprints"> Rose D. (2016) Working in Sprints. In: Data Science. Apress, Berkeley, CA </ref>
+
 
<ref name="Harvey"> Maylor H. (2010) Project Management. In: Financial Times Prentice Hall </ref>
 
 
<ref name="PMBOK"> Project Management Institute (2013) A Guide to the Project Management Body of Knowledge (PMBOK® Guide). In: Project Management Institute </ref>
 
 
<ref name="PRINCE"> Office Of Government Commerce (2009) Managing Successful Projects with PRINCE2™. In: TSO </ref>
 
 
<ref name="Scrum"> H. Frank Cervone (2011) Understanding agile project management methods using Scrum. In: OCLC Systems & Services: International digital library perspectives </ref>
 
<ref name="DSLC"> Rose D. (2016) Using a Data Science Life Cycle. In: Data Science. Apress, Berkeley, CA </ref>
+
<ref name="DSLC"> Rose D. (2016) Data Science: Create Teams That Ask the Right Questions and Deliver Real Value - Chp. 12: Using a Data Science Life Cycle. In: Data Science. Apress, Berkeley, CA </ref>
 +
<ref name="fig1"> Inspiration for Sprint burn-down chart: http://www.sw-engineering-candies.com/blog-1/howtouseproduct-burndown-chartsandsprint-burndown-chartsnotonlyinscrumprojects </ref>
 +
<ref name="fig2"> Inspiration for Release burn-down chart: http://www.softwaretestingstudio.com/burndown-chart-agile-scrum/ </ref>
 +
<ref name="fig3"> Inspiration for Product burn-down chart: https://www.scrum-institute.org/Burndown_Chart.php </ref>
 +
<ref name="Pitfall"> Rose D. (2016) Data Science: Create Teams That Ask the Right Questions and Deliver Real Value - Chp. 14: Avoiding Pitfalls in Delivering in Data Science Sprints. In: Data Science. Apress, Berkeley, CA </ref>
  
 
</references>
 
</references>

Latest revision as of 18:40, 16 November 2018

Developed by Sarah Bourdiaux Terp


Abstract

Working in sprints facilitates quick, continuous review of results, which allows regular feedback and keeps the project on the right path towards success for all parties involved. This also helps ensure that the project remains profitable and maintains the interest and momentum of stakeholders, which can be critical for the funding of the project [1]. This rapid iteration challenges the traditional way of conducting projects, also known as waterfall project management. Traditional project management is characterized by more robustness and formality, involving very thorough preliminary planning as well as protracted requirements definition. The long-lasting requirements definition often results in requirements that are outdated before project development has even begun. Agile project management (APM) was thus developed to meet the needs of a new type of project where the development phase is more crucial than the planning phase, namely information systems and technology projects [2].

Sprints are part of the APM framework scrum, which is typically applied within IT projects, software development projects and projects of a more exploratory nature like data science projects. These types of projects are affected by constant technological development, which makes sprints an appropriate way of managing them, as this methodology helps in responding to high levels of change and uncertainty [3] [4].


This article investigates relevant aspects of the sprint methodology within data science projects. It provides concrete and hands-on recommendations to project managers or scrum masters of data science projects who are about to plan and conduct sprints with their team.

Motivation

According to PRINCE 2, a project is "a temporary organization that is created for the purpose of delivering one or more business products according to an agreed Business Case." [4]. To deliver such business products, a project needs to be managed. PRINCE 2 defines project management as "the planning, delegating, monitoring and control of all aspects of the project, and the motivation of those involved, to achieve the project objectives within the expected performance targets for time, cost, quality, scope, benefits and risks." [4].

Projects go through a series of phases, also known as the project's life cycle. These phases can differ from project to project but are typically broken down by deliverables or milestones. According to the PMBOK® Guide, the phases can in general be summed up as starting the project, organizing and preparing, carrying out the project work and eventually closing the project [3]. In the traditional predictive life cycle, the product and the project's deliverables are clearly defined at the beginning, with little room for changes in the scope of the project unless they are carefully managed. The project scope, costs and time are thus defined as early in the process as possible. This requires that a project can be defined in all these respects at an early stage and that it is characterized by low levels of change and uncertainty. However, some projects involve so much uncertainty that defining these aspects early is simply too difficult, which makes traditional project management and predictive life cycles inadequate. This is where APM and adaptive life cycles like sprints become relevant and challenge the traditional project life cycle, such as the one defined by the PMBOK® Guide [3]. New demands on projects thus require new methodologies and frameworks.

With its origin in agile software development, APM emphasizes two main concepts that help define how project teams can adapt rapidly to unpredictable and changing requirements. First, risk is minimized when the team focuses on short iterations with clearly defined deliverables. Second, the feedback loop created by these short iterations facilitates direct communication with partners during the development process, which helps avoid unnecessary documentation and bureaucracy [2].

Agile Project Management

Scrum Roles and Process

Figure 1: Agile Project Management Frameworks

Adaptive life cycles like sprints are characterized by very rapid iterations of approximately 2-4 weeks. Each sprint is fixed in time and cost at its beginning. At this point, the product backlog list, also known as the project requirements, is also reviewed in order to determine which of the requirements can be delivered within the next iteration [3].

Within APM, a large number of different but useful frameworks exist. Three of the most frequently used ones, namely the scrum, lean and kanban frameworks, are illustrated in figure 1. Sprints are part of the scrum framework, which is mainly used within IT projects [1]. Scrum consists of three major components: the roles, the process and the artifacts.

Figure 2: Scrum Activities

Starting with the roles, the scrum team is cross-functional and consists of a product owner, a scrum master and a scrum team. This cross-functional constellation works full-time on the project in order to get through the product backlog list in due time. The second major scrum component, the process, comprises five overall activities: the kick-off, the sprint planning meeting, the sprint, the daily scrum, and the sprint review meeting, all of which are illustrated in figure 2 [2]. The third major scrum component, the artifacts, includes the product backlog, the sprint backlog and the burn-down charts, which are elaborated further in the Scrum Artifacts section.

At the beginning of a project, the product owner, the scrum master and the scrum team all meet to kick it off. This meeting occurs only once during a project. At the kick-off meeting, the high-level backlog for the project, which is the overall project scope and requirements, as well as its major goals are defined. Subsequently, the same participants meet again at the beginning of each sprint for a sprint planning meeting. Here, they define the product backlog list, i.e. the decomposition of the high-level backlog. Furthermore, they determine the sprint goal, which is the formal outcome of the sprint. Unlike the kick-off meeting, the sprint planning meeting is held at the beginning of each sprint within a project.

After defining these two main parts, the participants outline the sprint backlog. These sprint planning meetings usually take up to a day [2]. Having planned the sprint, its backlog and goals, the team is now ready to explore the sprint’s research topics.

Scrum Artifacts

Figure 3: Examples of burn-down charts made in Excel (Inspired by [5] [6] [7])

As mentioned, the scrum artifacts include the product and sprint backlogs as well as the burn-down charts. The product and sprint backlogs are defined during the sprint planning meetings. The product backlog is defined together with the product owner, whereas the sprint backlog is defined by the scrum team alone, since this type of backlog is more specific to the individual sprint [2].

Unlike traditional project management, scrum deliberately makes use of burn-down charts. Burn-down charts are simple two-dimensional charts showing the progress of a given process. Their purpose is to provide relevant information about this progress in an easy-to-comprehend manner. This is especially relevant during scrum and sprints, where time is a limited resource that needs to be managed carefully [2]. Three types of burn-down charts are commonly used. The sprint burn-down chart documents the progress of the sprint, the release burn-down chart documents the progress of the release, and the product burn-down chart documents the overall project progress [2]. An illustration of what these three types of burn-down charts can look like is given in figure 3.

As the examples in figure 3 illustrate, time is typically represented on the x-axis and the remaining work on the y-axis. The sprint burn-down chart is on a very operational level, showing the day-to-day progress by depicting the total backlog hours remaining in the sprint per day. These total backlog hours are estimated based on the amount of time left in the sprint. Ideally, the sprint burn-down chart thus decreases to zero hours remaining by the end of the sprint.

The release burn-down chart works in a similar way, but it represents the time remaining until the release is done. Finally, the product burn-down chart is used to indicate the overall project progress on a higher level [2].

These burn-down charts can thus help the scrum team, and especially the scrum master, keep an easy overview of how the project is progressing and, on a more specific level, get a day-to-day overview of how each sprint is progressing.
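The data behind a sprint burn-down chart can be illustrated with a minimal Python sketch. It only computes the two series that such a chart would plot: the backlog hours actually remaining per day and an ideal straight line down to zero. The names `SprintDay`, `remaining_hours` and `ideal_line`, as well as the example figures, are assumptions made for illustration and do not come from the cited sources.

```python
from dataclasses import dataclass


@dataclass
class SprintDay:
    day: int                # working day within the sprint, starting at 1
    hours_completed: float  # backlog hours burned down on that day


def remaining_hours(total_backlog_hours, days):
    """Return the backlog hours remaining at the end of each recorded day."""
    remaining = []
    left = total_backlog_hours
    for entry in sorted(days, key=lambda d: d.day):
        # Remaining work never drops below zero, even if the team overdelivers.
        left = max(left - entry.hours_completed, 0)
        remaining.append(left)
    return remaining


def ideal_line(total_backlog_hours, sprint_length_days):
    """Straight line from the full backlog on day 0 to zero on the last day."""
    step = total_backlog_hours / sprint_length_days
    return [total_backlog_hours - step * d for d in range(sprint_length_days + 1)]


if __name__ == "__main__":
    # Hypothetical sprint: 80 backlog hours, 4 working days recorded so far.
    log = [SprintDay(1, 10), SprintDay(2, 20), SprintDay(3, 25), SprintDay(4, 15)]
    print(remaining_hours(80, log))
    print(ideal_line(80, 10))
```

Comparing the two series day by day shows whether the sprint is ahead of or behind the ideal pace, which is the overview the scrum master reads off the chart.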

Sprint Methodology

What characterizes sprints, besides being short iterations of at most four weeks, is that no external interference should influence the work of the scrum team. This also means that project requirements cannot be changed during a sprint. In general, it is recommended to hold short daily stand-up meetings of 15 minutes, known as scrum meetings, during a sprint. Here, relevant matters such as what has been done since the last scrum meeting and what will be done before the next one can be discussed.

The purpose of these daily meetings is to keep track of how the sprint, and thereby the project, is progressing. This can help the scrum master keep track of the team and ensure that the sprint is as efficient as possible [2]. The meetings are short and informal, leaving no room for problem solving or thorough explanations, hence the importance of standing up during these meetings. Standing up can help the team stay focused and remove any obstacles to progress [8].

Within data science projects, five additional meetings are recommended within each sprint. These meetings are discussed further in the Application section. After each sprint is finished, it is reviewed during the sprint review meeting. Here, the sprint product is demonstrated to the product owner.

Application

Data Science Life Cycle Sprints

Depending on the project type, a number of different adaptive life cycles exist. Data science projects differ from software development and data mining projects as they are of a more experimental and exploratory nature and thus need a less rigid project life cycle. In data science projects, the team learns from data and adjusts its work to the results accordingly. The amount of planning and the interdependencies between the project phases in the software development life cycle (SDLC) and the data mining life cycle (CRISP-DM) are thus not suited to data science projects. The data science life cycle (DSLC) sprint is better suited to data science projects due to its more lightweight and agile nature. It consists of six phases, which are based on the scientific method [9]. These six phases are:

  • Identify
  • Question
  • Research
  • Results
  • Insights
  • Learn
Figure 4: Data Science Life Cycle (Inspired by D. Rose (2016) [9])

Unlike in the SDLC, where each phase leads to the next, a team can “cycle” through the questioning, researching and results phases in each DSLC sprint. This mechanism is illustrated by arrows in figure 4. This inner cycle gives data science projects flexibility and agility. In the identification phase, the scrum team starts by identifying the key roles and players in a given context. Then, in the questioning phase of the DSLC sprint, the team starts asking relevant questions about these key players in order to explore the data in the best way possible. To help the team identify all the right questions, the use of a question board is recommended. The question board allows the rest of the organization to contribute its questions. The board should be easily accessible so that anyone can add a post-it with a question when walking by [1].

During the research phase, the team’s data analysts explore the data based on the identified players and questions, also known as research topics. Here, the rest of the team will work closely with the data analysts to make sure that there is compliance with the desired outputs. When the data analysts have finished exploring data within the given research topics, a short report with the results from each research topic is produced during the results phase. These short reports can subsequently be used if a research topic and its results provide a solid basis for further research. When all research topics have been explored, the team evaluates the results and assesses if they provide any valuable and interesting insights. Based on these insights, the team will in the last phase try to create organizational knowledge, which eventually can provide actual value to the rest of the organization showing what has been learned from the research in the sprint [9].
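The difference between the linear SDLC and the DSLC sprint with its inner cycle can be made concrete with a minimal Python sketch. It lists the phases a team passes through in one sprint when the question, research and results phases are repeated a given number of times. The function name `sprint_path` and the parameter `inner_cycles` are hypothetical; the source only states that the team can cycle through these three phases, not how often.

```python
from enum import Enum


class Phase(Enum):
    IDENTIFY = 1
    QUESTION = 2
    RESEARCH = 3
    RESULTS = 4
    INSIGHTS = 5
    LEARN = 6


# The three phases the team can "cycle" through within one sprint.
INNER_CYCLE = [Phase.QUESTION, Phase.RESEARCH, Phase.RESULTS]


def sprint_path(inner_cycles):
    """List the phases visited in one sprint, repeating the inner cycle."""
    if inner_cycles < 1:
        raise ValueError("a sprint passes through the inner cycle at least once")
    return [Phase.IDENTIFY] + INNER_CYCLE * inner_cycles + [Phase.INSIGHTS, Phase.LEARN]


if __name__ == "__main__":
    # Hypothetical sprint in which the team cycles twice before drawing insights.
    for phase in sprint_path(2):
        print(phase.name)
```

With `inner_cycles=1` the path reduces to the six phases in their listed order; larger values model a team that returns to the question board after seeing its first results.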

Data Science Life Cycle Sprint Meetings

In order to run scrum efficiently, it is necessary to manage sprints carefully. One way to ensure that the sprint is on track is to hold regular meetings. For data science projects, it is recommended that a sprint lasts no longer than two weeks and that the scrum team runs five different types of meetings during it [1]. These five meetings are:

  • Research Planning
  • Question Breakdown
  • Visualization Design
  • Storytelling Session
  • Team Improvement

All five meetings are to be held in every sprint to make sure that the scrum team provides interesting insights at the end of the sprint. The five types of meetings have different purposes, and together they ensure that each of the six phases of a DSLC sprint is covered properly, as illustrated in figure 5. It is essential that these five sprint meetings are all time-boxed.

A meeting is time-boxed when the team agrees upon the duration of the meeting prior to initiating it. What is agreed during the meeting thus has to last for the rest of the sprint, since time-boxed meetings cannot be rescheduled or followed up on. This helps ensure that a sprint is rapid and agile, focusing on exploring and experimenting rather than planning and organizing [1].

Figure 5: Data Science Sprint Meetings (Inspired by D. Rose (2016) [1])

Research Planning

Starting with the first sprint meeting, the Research Planning meeting, the scrum team evaluates all the questions that have been identified during the identification phase and on the question board. During this meeting, which typically lasts two hours, the team selects the most interesting questions, and the data analysts and scrum master subsequently prepare the sprint agenda. The sprint agenda is thus a compromise between the data analysts and the scrum master that ensures minimum viable research topics for exploring the data while avoiding spending unnecessary time on it [1].

Question Breakdown

During the Question Breakdown meeting, the scrum team gathers to review any new questions from the question board and tries to come up with some new ones. This type of meeting also serves to identify potential clusters in the questions and to determine whether any questions can be broken down into more manageable ones. It is recommended to have at least two one-hour sessions of such meetings during a sprint. These meetings ensure a constant flow of research topics and keep the whole team aligned. They are also a good setting for prioritizing all questions and preparing the following sprint [1].

Together with the Research Planning meetings, the Question Breakdown meetings are thus ways of managing the first two phases of the DSLC sprint: the identifying and questioning phases.

Visualization Design

During the third meeting, the Visualization Design meeting, the data analysts and scrum master together develop interesting visualizations based on the results from the data analysts’ experimentation with data. The meeting should not last more than an hour, and the visualizations can remain rough drafts [1]. These visualizations are subsequently used in the Storytelling Session.

The Visualization Design meeting thus helps manage the sprint results from the DSLC results phase and create valuable insights during the DSLC insights phase.

Storytelling Session

Based on the visualizations made by the data analysts and scrum master, the scrum team meets at a Storytelling Session to present the story of what it has learned during the sprint. The meeting should last an hour, in which data visualizations are shown, questions from the question board are discussed and the stories behind the questions are told [1].

The Storytelling Session is thus a meeting during the DSLC learning phase that helps transmit the knowledge gained through the sprint.

Team Improvement

At the end of the sprint, the scrum team meets for a two-hour improvement meeting where the team's progress is discussed and evaluated. Here, potential changes can be agreed upon before the next sprint, which helps ensure that the team is constantly progressing and learning fast from its mistakes and successes [1].
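As a rough check on how much of a sprint the five time-boxed meetings occupy, the durations given in the meeting descriptions can be added up in a short Python sketch. The durations are taken from the text; the number of sessions per sprint is an assumption of one for every meeting except Question Breakdown, for which the recommended minimum of two is used. The daily scrum meetings are not included.

```python
# (hours per session, sessions per sprint)
# Durations follow the meeting descriptions; session counts other than the
# two Question Breakdown sessions are assumptions made for this sketch.
MEETINGS = {
    "Research Planning": (2.0, 1),
    "Question Breakdown": (1.0, 2),    # at least two one-hour sessions
    "Visualization Design": (1.0, 1),  # upper bound: no more than an hour
    "Storytelling Session": (1.0, 1),
    "Team Improvement": (2.0, 1),
}


def total_meeting_hours(meetings):
    """Sum the time-boxed meeting hours for one sprint."""
    return sum(hours * sessions for hours, sessions in meetings.values())


if __name__ == "__main__":
    print(total_meeting_hours(MEETINGS))
```

Under these assumptions the five meeting types add up to eight hours per sprint, which leaves most of a two-week sprint for exploring the data.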

Limitations

Running a project in sprints is an ideal methodology when projects are of an exploratory nature or simply require fast development and deployment [1]. The principle of failing fast allows project teams to develop ideas and test them quickly without having to spend too much time on planning or waiting unnecessarily long for high-level approval. The overall focus is therefore on developing and delivering minimum viable products and working further on them if they are a success and thus live up to the sprint goals and especially the project goals [2] [10].

However, even though APM methodologies are said to minimize risks, there are some limitations that both the team and the scrum master of DSLC sprint projects should be aware of [2]. Based on the PMBOK® Guide and relevant literature, two main limitations are discussed below.

No interfering

Since sprints are very limited in time, and DSLC sprints especially so [9], it is crucial that the team works exclusively on the sprint backlog [2]. This can be a challenge, especially in larger organizations, where the risk of being interrupted is high due to the number of employees. Scrum teams are cross-functional, so their members are not necessarily used to working together. Other employees in the members' departments may therefore be unaware of the sprint and reach out to their colleagues for help as they are used to doing. It is therefore important that the scrum master emphasizes the importance and priority of the sprint backlog and, if necessary, helps communicate this to team members' direct managers.

No objectives

As DSLC sprints are mainly recommended for exploratory projects like data science projects, the team needs to adjust its usual working habits: instead of working towards specific objectives, it develops research topics based on relevant questions during the Research Planning meeting [10]. These research topics then serve as requirements in the sprint backlog, but as data science projects are exploratory, there is no way of defining the objectives before the data has been explored.

This way of working thus challenges the usual perception of work, where there are clear objectives that have to be met before the project can come to an end [3]. As the PMBOK® Guide describes, a project end is reached when "the project's objectives have been achieved or when the project is terminated because its objectives will not or cannot be met..." [3]. This lack of specific objectives makes DSLC sprints unsuitable for other types of projects that have clear deliverables. Examples are construction projects, which have a very clear start and end as well as clearly defined deliverables and success criteria. These types of projects also require a great deal of planning and funding, which runs contrary to the nature of DSLC projects [9].

DSLC sprints are not explicitly part of well-renowned standards like the PMBOK® Guide or PRINCE 2. However, they are a subcategory of the adaptive and agile life cycles described in the PMBOK® Guide [3]. These project life cycles are all based on the APM methodology, and DSLC sprints are thus an extension of the adaptive life cycles.

Glossary

APM - Agile Project Management

DSLC - Data Science Life Cycle

SDLC - Software Development Life Cycle

CRISP-DM - Cross Industry Standard Process for Data Mining

Annotated Bibliography

Rose D. (2016) Data Science: Create Teams That Ask the Right Questions and Deliver Real Value In: Data Science. Apress, Berkeley, CA: This book explains in detail how to run data science projects successfully. This wiki article has primarily drawn on the chapters on how to deliver successful data science projects, but the book also covers many other relevant aspects. The first section defines what data science is, the second digs into how to build a well-functioning data science team, and the last focuses on how to ask the right questions during the Question Breakdown meetings.

Project Management Institute (2013) A Guide to the Project Management Body of Knowledge (PMBOK® Guide). In: Project Management Institute: This guide provides widely accepted guidelines, rules and characteristics for project, program and portfolio management. In this wiki article, the PMBOK® Guide has primarily been used for its section on project life cycles. However, it covers many more relevant subjects within project, program and portfolio management. For example, it also provides extensive guidelines on Project Scope, Time, Cost and Quality Management as well as Project Risk Management.

H. Frank Cervone (2011) Understanding agile project management methods using Scrum: This article provides a very clear understanding of what scrum is and how it relates to agile project management. Some details are addressed in this wiki article, but the interested reader will find more in the article itself. It traces the origin of agile project management to the agile software development movement and provides tangible tools for managing scrum projects.

References

  1. Rose D. (2016) Data Science: Create Teams That Ask the Right Questions and Deliver Real Value - Chp. 13: Working in Sprints. In: Data Science. Apress, Berkeley, CA
  2. H. Frank Cervone (2011) Understanding agile project management methods using Scrum. In: OCLC Systems & Services: International digital library perspectives
  3. Project Management Institute (2013) A Guide to the Project Management Body of Knowledge (PMBOK® Guide). In: Project Management Institute
  4. Office Of Government Commerce (2009) Managing Successful Projects with PRINCE2™. In: TSO
  5. Inspiration for Sprint burn-down chart: http://www.sw-engineering-candies.com/blog-1/howtouseproduct-burndown-chartsandsprint-burndown-chartsnotonlyinscrumprojects
  6. Inspiration for Release burn-down chart: http://www.softwaretestingstudio.com/burndown-chart-agile-scrum/
  7. Inspiration for Product burn-down chart: https://www.scrum-institute.org/Burndown_Chart.php
  8. Maylor H. (2010) Project Management. In: Financial Times Prentice Hall
  9. Rose D. (2016) Data Science: Create Teams That Ask the Right Questions and Deliver Real Value - Chp. 12: Using a Data Science Life Cycle. In: Data Science. Apress, Berkeley, CA
  10. Rose D. (2016) Data Science: Create Teams That Ask the Right Questions and Deliver Real Value - Chp. 14: Avoiding Pitfalls in Delivering in Data Science Sprints. In: Data Science. Apress, Berkeley, CA