authorSzymon Szukalski <szymon@skas.io>2024-07-26 11:15:52 +1000
committerSzymon Szukalski <szymon@skas.io>2024-07-26 11:15:52 +1000
commita17f0cc0eda4b862a223cb5578484245bdcf9517 (patch)
treeeb4f2e5ea3ea99d1c7d4f3fc2e7a462bf4d6ea21
parent0b24d43fc4517e92025efc552ff7c1cc6ddaff2e (diff)
Update README with assumptions, approach, next steps
-rw-r--r--README.md91
-rw-r--r--markr_data_model.pngbin0 -> 44693 bytes
2 files changed, 80 insertions, 11 deletions
diff --git a/README.md b/README.md
index 1125903..8d9b6af 100644
--- a/README.md
+++ b/README.md
@@ -2,22 +2,91 @@
**Szymon Szukalski's Submission to the Stile Coding Challenge - July 2024**
+## Assumptions
+
+**Deployment**
+- Encryption is not required
+- Authentication is not required
+- This is a standalone service used by a single institution
+  - In order to support multiple institutions, each test result would need to be associated with an institution identifier.
+- Imported data should persist between restarts of the service
+
+**Import endpoint**
+- That all the fields in the supplied XML document are mandatory and cannot be blank
+- That the `student-id` is unique
+- That marks obtained will never be higher than marks available
+- The `<answer/>` elements are ignored during import, as all the necessary information is contained in the `<summary-marks/>` elements
+- The endpoint may receive other XML documents which don't conform to the sample XML format; these should be rejected
+- Duplicate entries should be handled gracefully
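For reference, a payload in the shape these assumptions describe might look like the sketch below. Only `student-id`, `<answer/>`, and `<summary-marks/>` are taken from the assumptions above; the surrounding element names and attributes are illustrative, not the actual challenge format:

```xml
<!-- Hypothetical payload sketch. Element and attribute names other than
     student-id, answer, and summary-marks are illustrative assumptions. -->
<mcq-test-results>
  <mcq-test-result scanned-on="2024-07-26T10:00:00+10:00">
    <student-id>521585128</student-id>
    <test-id>1234</test-id>
    <answer question="0" marks-available="1" marks-awarded="1">D</answer>
    <summary-marks available="20" obtained="13"/>
  </mcq-test-result>
</mcq-test-results>
```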
+
+**Aggregate endpoint**
+- That standard deviation is a requirement as it is included in the expected output:
+
+```shell
+curl http://localhost:4567/results/1234/aggregate
+{"mean":65.0,"stddev":0.0,"min":65.0,"max":65.0,"p25":65.0,"p50":65.0,"p75":65.0,"count":1}
+```
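A minimal sketch of how these aggregates could be computed. It uses the population standard deviation (which yields `0.0` for a single result, as in the output above) and a nearest-rank percentile; method and field names are illustrative, not the actual service code:

```java
import java.util.Arrays;

public class AggregateSketch {
    // Nearest-rank percentile on a sorted copy; illustrative only.
    static double percentile(double[] sorted, double p) {
        int idx = (int) Math.ceil(p / 100.0 * sorted.length) - 1;
        return sorted[Math.max(idx, 0)];
    }

    public static void main(String[] args) {
        double[] marks = {65.0};                 // percentages for one test
        double mean = Arrays.stream(marks).average().orElse(0.0);
        // Population standard deviation: sqrt of mean squared deviation
        double stddev = Math.sqrt(Arrays.stream(marks)
                .map(m -> (m - mean) * (m - mean)).average().orElse(0.0));
        double[] sorted = marks.clone();
        Arrays.sort(sorted);
        System.out.printf("{\"mean\":%.1f,\"stddev\":%.1f,\"min\":%.1f,\"max\":%.1f,"
                + "\"p25\":%.1f,\"p50\":%.1f,\"p75\":%.1f,\"count\":%d}%n",
                mean, stddev, sorted[0], sorted[sorted.length - 1],
                percentile(sorted, 25), percentile(sorted, 50),
                percentile(sorted, 75), marks.length);
    }
}
```

With a single mark of 65.0, this prints the same JSON shape as the expected output above.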
+
+## Approach
+
+- Implemented the application using Spring Boot, leveraging my Java background and the framework’s ease of setup for microservices.
+- Created Java classes to represent the XML payloads
+- Used PostgreSQL for data storage
+- Designed entity models for Student, Test, and TestResult to map database tables and manage relationships.
+- Applied common Spring patterns:
+ - Used JPA for ORM to interact with the PostgreSQL database.
+ - Defined RESTful endpoints with RestController to handle HTTP requests.
+ - Created repository interfaces for managing CRUD operations.
+ - Encapsulated business logic in service classes to maintain separation of concerns.
+- Developed unit and integration tests to validate REST endpoints and service functionality.
+- Set up a Docker Compose configuration backed by a multi-stage Dockerfile to build, test, and run the application independently, supporting CI/CD pipelines.
+
+**Data Model**
+![Markr Data Model](markr_data_model.png)
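The relationships in the diagram can be sketched as plain Java records (the real service uses JPA `@Entity` classes; the field names here are assumptions based on the model, not the actual entity code):

```java
// Plain-Java sketch of the three entities. A TestResult links one
// Student to one Test and carries the summary marks for that sitting.
record Student(long id, String studentId) {}

record Test(long id, String testId) {}

record TestResult(long id, Student student, Test test,
                  int marksAvailable, int marksObtained) {
    // Percentage score used by the aggregate endpoint.
    double percentage() {
        return 100.0 * marksObtained / marksAvailable;
    }
}
```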
+
+## Next Steps / Recommendations
+
+**Codebase Maturity**
+- Document the API with endpoint details, request/response examples, error handling, and authentication.
+- Expand the test suite with more test cases
+- Integrate the build and tests into a CI/CD pipeline
+
+**Security and Monitoring**
+- Implement SSL, encryption, and authentication to secure communication, encrypt data at rest, and control access.
+- Create health check endpoints to monitor service status and integrate with an observability platform.
+- Implement real-time performance monitoring to track system responsiveness and resource usage.
+
+**Performance**
+- Use profiling tools to identify and resolve performance bottlenecks in data processing and display.
+- Index frequently queried fields to speed up query execution.
+- Review and optimize SQL queries to reduce execution time.
+
+**Supporting Real-time Dashboards**
+- Implement real-time data streams with technologies like Apache Kafka or WebSockets to push live updates.
+- Set up an event-driven architecture to automatically refresh the dashboard when new data becomes available.
+- Calculate and store aggregate data in advance to speed up updates and reduce processing during real-time refreshes.
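The precomputation point can be sketched as a running aggregate that is updated once per imported result instead of rescanning all rows on each read. This is a sketch under that assumption, not the service's code:

```java
// Running aggregate maintained incrementally as results are imported,
// so dashboard reads don't recompute over the whole result set.
class RunningAggregate {
    private long count;
    private double sum, sumSq;
    private double min = Double.MAX_VALUE, max = -Double.MAX_VALUE;

    void add(double percentage) {
        count++;
        sum += percentage;
        sumSq += percentage * percentage;
        min = Math.min(min, percentage);
        max = Math.max(max, percentage);
    }

    double mean() { return sum / count; }

    // Population stddev recovered from the running sums.
    double stddev() { return Math.sqrt(sumSq / count - mean() * mean()); }

    double min() { return min; }
    double max() { return max; }
    long count() { return count; }
}
```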
+
+**Alternate Pipeline Approach**
+- Use an ELT approach by storing the original XML payload in block storage before loading and transforming it.
+- Storing the original XML payload enables updating the transformation logic and reprocessing the original data with new logic if needed.
+- Leverage the event-driven architecture to facilitate replaying and correcting results if bugs are fixed in the transformation logic, allowing for data reprocessing and validation.
+
## Running the Project
This project uses Docker for containerization.
### Docker CLI Version
-The project uses the latest version of the Docker CLI (version 27.1), which includes the integrated `docker compose`
+The project uses the latest version of the Docker CLI (version 27.1), which includes the integrated `docker compose`
command for managing multi-container Docker applications.
### Build
To build the Docker image for the application, run the following command:
-```shell
+```shell
docker compose build
-```
+```
This command will build the application image using the Dockerfile defined in the project.
@@ -25,30 +94,30 @@ This command will build the application image using the Dockerfile defined in th
To run the tests using Docker, use the following command:
-```shell
+```shell
docker compose run --rm tests
-```
+```
-This command will build the Docker image (if not already built), start a container for testing, and run the tests. The
+This command will build the Docker image (if not already built), start a container for testing, and run the tests. The
--rm flag ensures that the test container is removed after the tests complete.
### Run
To run the application, use the following command:
-```shell
+```shell
docker compose up postgres service
-```
+```
-This command will start the application and its dependencies (like PostgreSQL) as defined in the docker-compose.yml
+This command will start the application and its dependencies (like PostgreSQL) as defined in the docker-compose.yml
file.
### Cleanup
To stop the running containers and remove them along with associated volumes, use:
-```shell
+```shell
docker compose down -v
-```
+```
This command will stop and remove the containers and any associated volumes. \ No newline at end of file
diff --git a/markr_data_model.png b/markr_data_model.png
new file mode 100644
index 0000000..3479fec
--- /dev/null
+++ b/markr_data_model.png
Binary files differ