Project 3: Sprint 2

This project assignment is worth 700 points.

This homework is to be done as a team.

P3 is due on Gradescope Wednesday, April 8, 2026 11:59pm.

Note

There's an earlier due date because of CMU Carnival.

Submission

Submit P3 as a PDF file to Gradescope.

If you prepare the response in some other software (like Tex), please export as PDF before submitting. Include your name and Andrew ID at the top of the document.

Learning Goals

  • Use an LLM to generate human-verifiable documentation (wireframes and development specifications) that describe the behavior of software artifacts.
  • Create machine-verifiable test cases that demonstrate that implemented pieces of functionality correctly satisfy the intended user stories and development specification.

1. Project Context

In the last assignment, you used the LLM to create the user stories, user interface, and code for your startup application. But, how do you validate what the LLM actually created and tell whether what was produced is what you wanted?

In this project assignment, you will:

  1. Use the LLM to create machine-executable test cases (unit tests and acceptance tests) that show that the UI and code satisfy the requirements described in the user stories.
  2. Use the LLM to create documentation for the UI and code you have created that will convince you and your users that you have built the product defined by your user stories.

4. Development Specification (300 points)

Choose 3 user stories from P2 and ask an LLM to document them as 3 separate dev specs. Each dev spec (which will be very long) should include exactly the following sections, with enough detail that you (or some other human) could learn the following information:

  1. The primary and secondary owners of the user story.
  2. The date the code was merged into the main branch.
  3. An architecture diagram in Mermaid that also illustrates where the components of the application execute (e.g., client, server, cloud, etc.).
  4. An information flow diagram in Mermaid that shows which user information and application data moves between architectural components and the direction in which they flow.
  5. A class diagram in Mermaid that shows the classes relevant to the user story's implementation and their superclass/subclass relationships. (Don't accidentally leave any classes or interfaces out of your diagram. We're going to check.)
  6. A list of all of the classes in the implementation relevant to this user story. Each class must include a list of public and private fields and methods with explanations for the purpose for each. First list the public fields and methods together, then list the private fields and methods. Each list should be grouped by concept.
  7. A complete list of technologies/libraries/APIs used in your system that you aren’t writing yourself.
    1. Indicate what the technology is being used for.
    2. Indicate why that technology was picked over others.
    3. Provide URLs to the source location, author, and documentation for each technology.
      1. Don’t leave out technologies like language, common libraries, or necessary tools.
    4. Do mention the required version number for each technology.
  8. A list of each data type that you will be storing in a database (i.e. long-term storage). Explain the purpose of each field in the database. How much storage (in bytes) will it require?
  9. A list of the user-visible and internally-visible effects if your frontend application:
    1. crashed its process
    2. lost all its runtime state
    3. erased all stored data
    4. noticed that some data in the database appeared corrupt
    5. saw a remote procedure call fail
    6. became overloaded
    7. ran out of RAM
    8. found the database out of space
    9. lost its network connectivity
    10. lost access to its database
    11. was spammed by a bot that signed up
  10. A list of all Personally Identifying Information (PII) stored in long-term storage in your system. This is information about your users that could be used by an evil actor to steal their identity.
    1. Justify why you need to keep each data item in storage.
    2. How exactly is it stored?
    3. How did the data enter your system?
    4. Through what modules, components, classes, methods, and fields did it move before entering long-term storage?
    5. Through what modules, components, classes, methods, and fields will it move after leaving long-term storage?
    6. List the people on your team who have responsibility for securing each unit of long-term storage/database.
    7. Describe your procedures for auditing routine and non-routine access to the PII data.
    8. Is the PII of a minor under the age of 18 solicited or stored by the system?
      1. Why?
      2. Does your application solicit a guardian’s permission to have that PII?
      3. What is your team’s policy for ensuring that minors' PII is not accessible by anyone convicted or suspected of child abuse?
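
The architecture-diagram requirement (item 3 above) can be sketched in Mermaid roughly like this; the components named here (a browser-hosted frontend, an API server, a database in the cloud) are placeholder assumptions, not a prescription for your stack:

```mermaid
flowchart LR
    subgraph Client["Browser (client)"]
        UI[Frontend app]
    end
    subgraph Cloud["Cloud host (server)"]
        API[API server]
        DB[(Database)]
    end
    UI -->|HTTPS / JSON| API
    API -->|SQL queries| DB
```

GitHub renders `mermaid` fenced blocks in Markdown files automatically, so committing the diagram source directly into the dev spec keeps it reviewable in the PR.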

Follow the required software process. Commit each dev spec into the repository. Then, wrap up the commits into a pull request (PR) on GitHub and submit it. Have a teammate review your dev spec in the PR and iterate until they are satisfied with the quality of the spec. Approve the PR and merge it into main.

  1. Turn in a URL to each dev spec's PR.
  2. If you used an LLM for any part of this section, please turn in a copy of the chat log or a URL to the chat log that is accessible to all of the 17-356 instructors.

5. Machine-Executable Unit Test Cases (200 points)

One way to evaluate whether your implementation is good enough is to test it. You will create unit tests for each class that can be executed by the computer (not an LLM) to tell you if the code meets your specifications.

Look through the source code of your frontend and identify 10 code files that contain the most core functionality that implements your frontend user stories. Only choose files that have at least 5 functions.

Most projects that are written in JavaScript or TypeScript should use the Jest unit testing framework. You may use another framework, but you may not manually test the code; you must use a testing framework.

First, install the test framework into your application. Create a tests/ folder and keep all of your test files there. Next, prompt an LLM to write unit tests for each function in the code file. Remember, do not ask the LLM to write all the unit tests in a single prompt. Only generate one test at a time. There must be at least one unit test for every function, but there may very well be several unit tests per function required to fully test its operation. If a mock for external functionality is needed, have your test framework create it for you.
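
To make the mocking step concrete, here is a minimal hand-rolled mock in plain JavaScript, sketching what `jest.fn()` automates for you. The names `registerUser` and `sendWelcomeEmail` are hypothetical, not functions from your project:

```javascript
// Minimal hand-rolled mock: records every call it receives.
// jest.fn() builds a richer version of this for you.
function createMock() {
  const calls = [];
  const fn = (...args) => {
    calls.push(args);
    return true;
  };
  fn.calls = calls;
  return fn;
}

// Hypothetical code under test: registers a user, then notifies
// them through an injected dependency.
function registerUser(name, sendWelcomeEmail) {
  sendWelcomeEmail(name);
  return { name };
}

const mockSend = createMock();
registerUser("ada", mockSend);
console.log(mockSend.calls.length); // 1 (called exactly once)
console.log(mockSend.calls[0][0]);  // "ada" (called with the right argument)
```

Injecting the dependency as a parameter is what makes the mock possible; Jest's module mocks achieve the same substitution without changing the function signature.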

For example, suppose you have a validateEmail(string address) function to test. One possible test may check whether GMail addresses are considered valid. The input address would be "realemailaddress@gmail.com" and the expected output would be the boolean "true".
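
That example might look like the following self-contained sketch. The `validateEmail` implementation here is a stand-in assumption (your project's rule may differ); the Jest version of the test appears in the comment:

```javascript
// Hypothetical implementation for illustration; your project's
// validateEmail may use a different validation rule.
function validateEmail(address) {
  // One "@", a non-empty local part, and a domain containing a dot.
  return /^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(address);
}

// The equivalent Jest unit test would be:
//   test("accepts a GMail address", () => {
//     expect(validateEmail("realemailaddress@gmail.com")).toBe(true);
//   });

console.log(validateEmail("realemailaddress@gmail.com")); // true
console.log(validateEmail("not-an-email"));               // false
```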

At the end, you should have one test file for each tested class.

Note

You will be graded on how well you prevent the LLM from hallucinating nonsensical test cases or creating duplicate or significantly overlapping test cases. The LLM must not generate test cases for functions and functionalities that do not exist.

Ask the LLM to generate scripts to set up the frontend and backend of your application and then execute the tests with your testing framework.

How did it go? Did every test pass? If not, use your LLM to give you a plan on how it wants to fix the bugs (ask it for three alternative fixes). Choose the bug fix you like and have the LLM make the change. Did your test case pass? Congratulations! If not, try again.

  1. Create a unit test feature in your GitHub Issues database for each class you are testing. Make sure to keep the status of this Issue up to date on your Kanban board.
  2. Once you are done generating and testing your test code, commit all of the test code to the repository.
  3. Wrap up the commits into a pull request (PR) on GitHub and submit it.
  4. Have a teammate or LLM do a code review on your PR and iterate until they are satisfied with the quality of the test code. Turn in the raw notes (or recording) from your human code review or the checklist you used and the LLM output if you asked an LLM to review your code.
  5. Approve the PR and merge it into main. Remember to keep the status of the user story up to date on your Kanban board.
  6. Turn in a URL to each class's unit test PR, identifying which class the PR was testing.
  7. If you used an LLM for any part of this section, please turn in a copy of the chat log or a URL to the chat log that is accessible to all of the 17-356 instructors.

6. Human-Verifiable Acceptance Test Cases (200 points)

User stories are difficult to machine-verify since they require satisfying a human user. Correctness is necessary, but definitely not sufficient. You will create acceptance tests that will guide a human to evaluate the functionality of each user story and ask them to vote on whether the user story has been successfully accomplished.

Pick 10 user stories from your application that have been implemented. For each user story, prompt an LLM to generate a sequence of instructions for a human user to follow using your application to accomplish the action specified by the user story. Then prompt the LLM to create a 3-question survey for the user to provide feedback on whether they believe the user story has been achieved. You must decide on three metrics that will definitely show that the user is satisfied with the user story (enough that they would spend money to buy your app or service).

  1. Create a human acceptance test feature in your GitHub Issues database for each user story you are testing. Make sure to keep the status of this Issue up to date on your Kanban board as you work through this assignment.
  2. For each user story:
  3. Provide the instructions to a human user to use the action/feature specified in the user story. Describe any prerequisite actions required to get the system set up in the right configuration for the user to begin.
  4. List the three most salient metrics that will get at whether the user is satisfied. Explain why you picked these three. Remember to consult The Lean Startup, The Mom Test, and the metrics we introduced in lecture to help you understand the value of each metric you have chosen.
  5. List the three questions in your survey that get the user to tell you about their feelings about the user story. Watch out here: sometimes you can't be too direct in asking, because your users will distrust your motives and make fun of your question instead.
    1. Example: Microsoft Windows pops up dialogs to ask "How likely are you to recommend Windows to a friend or colleague (Likert score 1-5) Please explain your score." One snarky user responded, "1 Not at all likely. I need you to understand that people don't have conversations where they randomly recommend operating systems to one another."
  6. Choose a classmate from a different team in this class and ask them to try out your instructions and take the survey. Were they satisfied? If not, improve your implementation and try again. You may have the same classmate try all of your user stories. Identify the classmate and report all of their survey responses (each time they take the survey) in your submission.
  7. Once you are done generating and testing your acceptance test procedures and surveys, commit everything to the repository.
  8. Wrap up the commits into a pull request (PR) on GitHub and submit it.
  9. Have a teammate review your PR and iterate until they are satisfied with the quality of the acceptance test procedures and survey.
  10. Approve the PR and merge it into main. Remember to keep the status of the user story up to date on your Kanban board.
  11. Turn in a URL to each acceptance test PR, identifying which user story the PR was testing.
  12. If you used an LLM for any part of this section, please turn in a copy of the chat log or a URL to the chat log that is accessible to all of the 17-356 instructors.