How to develop a real-time tracking and analytics system for your apps

Tracking and monitoring software can make a big difference for IT-related businesses. Whether they sell a service or an actual product, data about the different system functions and user actions can be decisive. Of course, one of the most important issues is user privacy, especially now that GDPR has become the new standard in user data protection, but this is not the main topic of this article.

It’s very important for both startups and established businesses to understand that tracking and monitoring is a must-have, as it can make the difference between failure and success. A proper solution for tracking and monitoring will help with the following aspects:

  • system behaviour under different types of load;
  • system outages;
  • error and crash reporting;
  • anonymous tracking of user actions;
  • tracking of load times;
  • revenue statistics.

Such a solution would help an IT business optimise its systems, find errors and bugs, identify system outages, observe the system’s behaviour under different loads, track the revenue and so on.

Even though developing a solution for tracking and monitoring isn’t cheap, it provides a lot of value for the business. Take a subscription audio streaming service as an example. A custom-built solution for tracking and monitoring would allow the business to track the number of users, the number of successfully processed and failed subscriptions, and load times when providing the content, and to identify bottlenecks when the load spikes. With this information, the business can fix subscription errors, optimise content delivery based on actual usage information, identify timeframes of high load and scale its web infrastructure accordingly.

Why not just use a ready-made solution?

While solutions like Google Analytics or Flurry Analytics serve well in some instances, they are limited to a specific set of features and don’t allow customised tracking and monitoring.

Another issue with these solutions is their pricing. Both can become expensive when the number of events reported per timeframe exceeds a specific limit. Another drawback of ready-made solutions is that most of them don’t grant access to raw data, which is needed in order to build custom reports later on.

Which general feature set would be appropriate for a real-time tracking and analytics system?

The purpose of such a system should be to solve business specific problems and help the business and technical teams in their efforts. There are some general features such a system should have.

  1. Raw data archiving
  2. Real-time tracking
  3. Real-time Visualisations
  4. Automated Reporting
  5. Flexibility in defining tracking KPIs
  6. Respect user privacy and secure all the information
  7. Tracking of online servers and services
  8. Error and crash tracking and reporting
  9. Tracking of load times on the client side
  10. A business intelligence tool, or integration with a dedicated BI tool
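
To make the "flexibility in defining tracking KPIs" item more concrete, here is a minimal sketch in which KPIs are described as data rather than hard-coded. The event shape, the KPI names and the `evaluateKpi` helper are assumptions made for illustration, loosely based on the audio streaming example above; they are not part of any specific product.

```typescript
// Hypothetical event shape; field names are illustrative, not a fixed schema.
interface KpiEvent {
  type: string;      // e.g. "subscription_succeeded"
  timestamp: number; // Unix epoch in milliseconds
  value?: number;    // optional numeric payload (revenue, load time in ms, ...)
}

// A KPI is plain data: which events it selects and how it reduces them.
interface KpiDefinition {
  name: string;
  matches: (event: KpiEvent) => boolean;
  aggregate: "count" | "sum" | "avg";
}

function evaluateKpi(kpi: KpiDefinition, events: KpiEvent[]): number {
  const selected = events.filter(kpi.matches);
  if (kpi.aggregate === "count") {
    return selected.length;
  }
  // Events without a value contribute 0 to the total.
  const total = selected.reduce((sum, e) => sum + (e.value ?? 0), 0);
  if (kpi.aggregate === "sum") {
    return total;
  }
  // "avg": report 0 for an empty selection instead of dividing by zero.
  return selected.length === 0 ? 0 : total / selected.length;
}

// Example definitions (hypothetical names) for a subscription streaming service.
const exampleKpis: KpiDefinition[] = [
  { name: "failed_subscriptions", matches: (e) => e.type === "subscription_failed", aggregate: "count" },
  { name: "revenue", matches: (e) => e.type === "subscription_succeeded", aggregate: "sum" },
  { name: "avg_load_time_ms", matches: (e) => e.type === "content_loaded", aggregate: "avg" },
];
```

With this approach, adding a new KPI means adding one entry to the list; the tracking code itself does not change.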

Technology stack for developing a real-time tracking and analytics system

HyperSense’s suggestion for the technology stack to be used when developing such a system is described below.

Ingestion system

Depending on the specific requirements, AWS Kinesis Data Firehose can be used as an ingestion system. Otherwise, you can opt for a custom-developed Node.js API that writes data into a messaging queue like RabbitMQ or AWS SQS. For the ingestion system, we consider two aspects to be of great importance: rapid ingestion and raw data archiving.
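
The core of a custom ingestion endpoint could look roughly like the sketch below. It keeps the work per request small (validate, archive the raw line, enqueue) so that ingestion stays fast. The `EventSink` interface is a hypothetical stand-in for a real RabbitMQ or SQS client, which would be asynchronous; the event fields are likewise assumptions.

```typescript
// Hypothetical event shape accepted by the ingestion API.
interface IngestedEvent {
  type: string;
  timestamp: number; // Unix epoch in milliseconds
  properties: Record<string, string | number | boolean>;
}

// Stand-in for a queue or archive client (RabbitMQ, SQS, a file, ...).
interface EventSink {
  send(line: string): void;
}

// In-memory sink, useful for local runs and tests.
class InMemorySink implements EventSink {
  readonly lines: string[] = [];
  send(line: string): void {
    this.lines.push(line);
  }
}

// Returns a cleaned event, or null when the payload is malformed.
function validateEvent(payload: unknown): IngestedEvent | null {
  if (typeof payload !== "object" || payload === null) return null;
  const candidate = payload as Record<string, unknown>;
  const type = candidate["type"];
  const timestamp = candidate["timestamp"];
  if (typeof type !== "string" || type.length === 0) return null;
  if (typeof timestamp !== "number" || !Number.isFinite(timestamp)) return null;

  // Keep only flat, primitive properties; nested values are dropped.
  const properties: Record<string, string | number | boolean> = {};
  const rawProps = candidate["properties"];
  if (typeof rawProps === "object" && rawProps !== null) {
    const source = rawProps as Record<string, unknown>;
    for (const key of Object.keys(source)) {
      const value = source[key];
      if (typeof value === "string" || typeof value === "number" || typeof value === "boolean") {
        properties[key] = value;
      }
    }
  }
  return { type, timestamp, properties };
}

function ingest(payload: unknown, queue: EventSink, archive: EventSink): boolean {
  const event = validateEvent(payload);
  if (event === null) return false; // reject malformed payloads early
  const line = JSON.stringify(event);
  archive.send(line); // raw copy first, so the event survives downstream failures
  queue.send(line);   // then hand over to the aggregation layer
  return true;
}
```

Writing to the archive before the queue is a design choice in this sketch: if aggregation fails later, the raw event can still be replayed.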

Data aggregation layer

The raw data must be aggregated in order to permit fast queries. AWS Kinesis Analytics is a managed service option for this layer. If you require a custom solution, we would suggest Node.js microservices.
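
As an illustration of what such a microservice might do, the sketch below rolls raw events up into per-minute counts per event type. The one-minute bucket size and the field names are assumptions chosen for the example; a real aggregation would follow the KPIs the business cares about.

```typescript
// Hypothetical raw event, reduced to the fields the aggregation needs.
interface RawEvent {
  type: string;
  timestamp: number; // Unix epoch in milliseconds
}

interface MinuteBucket {
  type: string;
  minuteStart: number; // start of the minute, Unix epoch in milliseconds
  count: number;
}

const MINUTE_MS = 60000;

function aggregatePerMinute(events: RawEvent[]): MinuteBucket[] {
  const buckets = new Map<string, MinuteBucket>();
  for (const event of events) {
    // Round the timestamp down to the start of its minute.
    const minuteStart = Math.floor(event.timestamp / MINUTE_MS) * MINUTE_MS;
    const key = `${event.type}|${minuteStart}`;
    const bucket = buckets.get(key);
    if (bucket !== undefined) {
      bucket.count += 1;
    } else {
      buckets.set(key, { type: event.type, minuteStart, count: 1 });
    }
  }
  // Sort by time, then by type, so the output order is deterministic.
  return Array.from(buckets.values()).sort((a, b) => {
    if (a.minuteStart !== b.minuteStart) return a.minuteStart - b.minuteStart;
    return a.type < b.type ? -1 : a.type > b.type ? 1 : 0;
  });
}
```

Queries against these buckets touch one row per type and minute instead of every raw event, which is what makes dashboards fast.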

Aggregated database

We recommend using Elasticsearch because it has the following features: a developer-friendly API, real-time analytics, ease of indexing, full-text search, resilient clusters. Another key factor would be the integration with Kibana.
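
To give an idea of the "ease of indexing" point, the sketch below builds the newline-delimited JSON body expected by the Elasticsearch bulk endpoint (`_bulk`): one action line followed by one document line per record, with a trailing newline. The document fields and the index name used in the usage note are hypothetical, and the HTTP call itself is left out.

```typescript
// Hypothetical aggregated document; field names are illustrative.
interface AggregatedDoc {
  id: string;          // used as the Elasticsearch document _id
  type: string;
  minuteStart: string; // ISO 8601 timestamp
  count: number;
}

// Builds an NDJSON body for the bulk endpoint: action line, then source line.
function toBulkBody(indexName: string, docs: AggregatedDoc[]): string {
  const lines: string[] = [];
  for (const doc of docs) {
    const { id, ...source } = doc;
    lines.push(JSON.stringify({ index: { _index: indexName, _id: id } }));
    lines.push(JSON.stringify(source));
  }
  // The bulk endpoint requires the body to end with a newline.
  return lines.length === 0 ? "" : lines.join("\n") + "\n";
}
```

A call such as `toBulkBody("tracking-minute", docs)` produces a body that can be sent with `Content-Type: application/x-ndjson`. Deriving the `_id` from the bucket means that re-indexing the same bucket overwrites the previous document instead of creating a duplicate.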

Visualisation tool

Our first option would be Kibana. When Kibana is not the right fit for the visualisations, our second option would be a custom solution built with Node.js, Express.js and AngularJS, using the Inspinia theme.

Cold storage

AWS S3 is our choice of cold storage for raw data. The raw data can be of tremendous help for both the business and development teams. We suggest storing it in CSV-formatted files.
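
A minimal sketch of how raw events might be serialised before upload is shown below. The column set, the anonymous `sessionId` field and the date-partitioned key prefix (`raw/year=.../month=.../day=.../`) are assumptions made for illustration; the upload to S3 is not shown.

```typescript
// Hypothetical raw event as stored in the archive.
interface ArchivedEvent {
  timestamp: number; // Unix epoch in milliseconds
  type: string;
  sessionId: string; // anonymous session identifier
  detail: string;
}

// Quotes a field when it contains a comma, quote or line break (RFC 4180 style).
function escapeCsvField(field: string): string {
  if (/[",\r\n]/.test(field)) {
    return '"' + field.replace(/"/g, '""') + '"';
  }
  return field;
}

function toCsv(events: ArchivedEvent[]): string {
  const header = "timestamp,type,session_id,detail";
  const rows = events.map((e) =>
    [new Date(e.timestamp).toISOString(), e.type, e.sessionId, e.detail]
      .map(escapeCsvField)
      .join(",")
  );
  return [header, ...rows].join("\n") + "\n";
}

// Builds a date-partitioned key prefix (in UTC) for the day of the timestamp.
function dailyPrefix(timestamp: number): string {
  const d = new Date(timestamp);
  const pad = (n: number): string => (n < 10 ? "0" : "") + n;
  return `raw/year=${d.getUTCFullYear()}/month=${pad(d.getUTCMonth() + 1)}/day=${pad(d.getUTCDate())}/`;
}
```

Grouping files by day keeps each file at a manageable size and makes it cheap to locate the data for a given period later.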

Query service on raw data

When you least expect it, situations arise in which you need to query the raw data and there is no time for aggregation or for implementing a custom aggregation algorithm. Our choice for querying CSV files stored in S3 is AWS Athena.
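
As an example of such an ad hoc question, the sketch below builds an Athena SQL query that counts the events of one day per event type. It assumes a hypothetical Athena table (here called `raw_events` in the usage note) defined over the CSV files, with an `event_type` column and string partition columns `year`, `month` and `day`; none of these names come from a real schema.

```typescript
// Builds a per-type event count query for a single day.
// Inputs are validated because they are interpolated into the SQL text.
function buildDailyCountQuery(table: string, year: number, month: number, day: number): string {
  if (!/^[a-z_][a-z0-9_]*$/.test(table)) {
    throw new Error(`Invalid table name: ${table}`);
  }
  // Round-trip through Date to reject impossible dates such as 2018-02-30.
  const date = new Date(Date.UTC(year, month - 1, day));
  if (
    date.getUTCFullYear() !== year ||
    date.getUTCMonth() !== month - 1 ||
    date.getUTCDate() !== day
  ) {
    throw new Error("Invalid date");
  }
  const pad = (n: number): string => (n < 10 ? "0" : "") + n;
  return [
    "SELECT event_type, COUNT(*) AS event_count",
    `FROM ${table}`,
    `WHERE year = '${year}' AND month = '${pad(month)}' AND day = '${pad(day)}'`,
    "GROUP BY event_type",
    "ORDER BY event_count DESC",
  ].join("\n");
}
```

Calling `buildDailyCountQuery("raw_events", 2018, 6, 1)` returns the query text, which can then be submitted through the Athena console or API. Filtering on partition columns limits how much data Athena has to scan.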

Notification system

Team members must be kept up to date with the latest changes, and sometimes informed of the latest issues. Our first choice is AWS Simple Notification Service (SNS), because it can deliver multiple types of messages (email, SMS, HTTP calls).
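
Deciding when to notify matters as much as the delivery channel. The sketch below shows one possible approach: threshold rules with a cooldown, so that a metric that stays high does not flood the team. The `AlertNotifier` interface is a hypothetical stand-in for whatever actually delivers the message (for example a wrapper around an SNS topic); the rule fields are assumptions as well.

```typescript
// Hypothetical alert rule: fire when a metric exceeds a threshold.
interface AlertRule {
  metric: string;
  threshold: number;
  cooldownMs: number; // minimum time between two alerts for this rule
}

// Stand-in for the delivery channel (e.g. a wrapper around an SNS topic).
interface AlertNotifier {
  notify(subject: string, message: string): void;
}

class AlertEvaluator {
  private readonly rules: AlertRule[];
  private readonly notifier: AlertNotifier;
  private readonly lastSentAt = new Map<AlertRule, number>();

  constructor(rules: AlertRule[], notifier: AlertNotifier) {
    this.rules = rules;
    this.notifier = notifier;
  }

  // Call this whenever a new value for a metric is available.
  observe(metric: string, value: number, now: number): void {
    for (const rule of this.rules) {
      if (rule.metric !== metric || value <= rule.threshold) continue;
      const last = this.lastSentAt.get(rule);
      // Skip the alert while the rule is still cooling down.
      if (last !== undefined && now - last < rule.cooldownMs) continue;
      this.lastSentAt.set(rule, now);
      this.notifier.notify(
        `Alert: ${metric}`,
        `${metric} is ${value}, above the threshold of ${rule.threshold}`
      );
    }
  }
}
```

Passing the current time in as `now` keeps the evaluator easy to test; a service would typically call it with `Date.now()`.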

Orchestration system for 3rd party integration

The orchestration system will allow easy integration with other solutions, like Jira for automated bug or crash reporting. Another use case is reporting revenue information to an affiliate. For the orchestration system we would use Node.js with LoopBack.
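
For the Jira use case, the orchestration layer mostly translates a crash record into the payload Jira expects. The sketch below maps a hypothetical crash report to the `fields` structure used by the issue creation endpoint of the Jira REST API (version 2, where the description is a plain string). The crash report shape, the 200-character summary cap and the project key in the usage note are assumptions; authentication and the HTTP call are omitted.

```typescript
// Hypothetical crash record produced by the tracking system.
interface CrashReport {
  appVersion: string;
  message: string;
  stackTrace: string;
  occurrences: number;
}

// Subset of the Jira issue creation payload used in this sketch.
interface JiraIssuePayload {
  fields: {
    project: { key: string };
    summary: string;
    description: string;
    issuetype: { name: string };
  };
}

// Conservative cap for the summary length; adjust to your Jira configuration.
const MAX_SUMMARY_LENGTH = 200;

function crashToJiraIssue(crash: CrashReport, projectKey: string): JiraIssuePayload {
  // Summaries must be a single line, so collapse whitespace and truncate.
  const summary = `[${crash.appVersion}] ${crash.message}`
    .replace(/\s+/g, " ")
    .slice(0, MAX_SUMMARY_LENGTH);
  const description = [
    `Occurrences: ${crash.occurrences}`,
    "",
    "{noformat}",
    crash.stackTrace,
    "{noformat}",
  ].join("\n");
  return {
    fields: {
      project: { key: projectKey },
      summary,
      description,
      issuetype: { name: "Bug" },
    },
  };
}
```

A call such as `crashToJiraIssue(crash, "TRACK")` returns an object that can be serialised with `JSON.stringify` and posted to Jira. Keeping the mapping in a pure function makes it straightforward to add other targets later.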

 

If you are looking to develop a real-time tracking and analytics system or if you have any questions related to this topic please feel free to Contact Us.