Data Collection
At Market Compass, we employ a robust data gathering and storage system to collect and organize information from various sources, providing our users with comprehensive insights into the cryptocurrency market.
Here's an overview of our data gathering process and the NoSQL database we use for storage:
Data Gathering
Source Variety:
We gather data from diverse sources, including social media platforms (such as X (Twitter)), online forums (like 4chan), newsletters, Discord channels, Telegram groups, and more.
Each source provides a unique perspective and contributes to a comprehensive understanding of market sentiment and trends.
Mining Activity:
Our mining activity, facilitated through Subnet 17 on the Commune AI chain, collects real-time data from X (Twitter) API queries.
Every minute, miners receive a different set of queries to run, which keeps the dataset diverse and up to date.
Queries are distributed among miners, each responsible for gathering data on specific cryptocurrency-related topics.
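The exact distribution logic is not described here, so the following is only a minimal sketch of one way queries could be spread across miners each minute. It assumes a simple round-robin scheme; the function name `assign_queries`, the miner IDs, and the query strings are all hypothetical.

```python
def assign_queries(miners, queries, round_number):
    """Assign queries to miners round-robin, rotated by round_number.

    Rotating the starting miner each round means a given miner does not
    keep receiving the same queries minute after minute.
    This is an illustrative assumption, not the subnet's actual algorithm.
    """
    if not miners:
        raise ValueError("at least one miner is required")
    assignments = {miner: [] for miner in miners}
    for i, query in enumerate(queries):
        miner = miners[(i + round_number) % len(miners)]
        assignments[miner].append(query)
    return assignments


# Hypothetical miners and topics for two consecutive one-minute rounds.
miners = ["miner-a", "miner-b"]
queries = ["bitcoin", "ethereum", "solana"]
round_0 = assign_queries(miners, queries, round_number=0)
round_1 = assign_queries(miners, queries, round_number=1)
```

In this sketch, `miner-a` handles "bitcoin" and "solana" in the first round and only "ethereum" in the second, so the workload rotates between rounds.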
Data Scraping and Collection:
In addition to X (Twitter) API queries, we utilize web scraping techniques to extract data from platforms like 4chan, newsletters, Discord, and Telegram.
This multi-source approach allows us to capture a wide range of opinions, sentiments, and discussions within the cryptocurrency community.
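Combining several sources usually means mapping each source's raw payload onto one common record shape before storage. The sketch below illustrates that idea only; the field names in `FIELD_MAP` and the output keys are assumptions made for the example, and real payloads from each platform will differ.

```python
from datetime import datetime, timezone

# Hypothetical mapping from each source's field names to a common shape.
FIELD_MAP = {
    "x": {"text": "text", "author": "username", "ts": "created_at"},
    "4chan": {"text": "com", "author": "name", "ts": "time"},
    "telegram": {"text": "message", "author": "sender", "ts": "date"},
}


def normalize(source, raw):
    """Convert a raw record from a known source into a common document."""
    if source not in FIELD_MAP:
        raise ValueError(f"unknown source: {source}")
    fields = FIELD_MAP[source]
    # Timestamps are assumed to be Unix seconds in this sketch.
    timestamp = datetime.fromtimestamp(raw[fields["ts"]], tz=timezone.utc)
    return {
        "source": source,
        "text": raw.get(fields["text"], "").strip(),
        "author": raw.get(fields["author"]),
        "timestamp": timestamp.isoformat(),
    }


record = normalize("4chan", {"com": " BTC to the moon ", "name": "Anonymous", "time": 0})
```

Keeping the `source` field on every record lets later analysis weigh or filter sentiment by where it came from.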
NoSQL Database Storage
Choice of Database:
We utilize a non-SQL (NoSQL) database for storing data.
NoSQL databases offer flexibility and scalability, making them well-suited for handling unstructured and semi-structured data, which is common in our data sources.
Scalability and Performance:
The NoSQL database architecture allows us to scale horizontally, adding nodes to accommodate growing data volumes and user demand.
This helps maintain performance as the dataset expands over time.
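Horizontal scaling generally relies on spreading records across several nodes (shards). As a rough illustration, and not a description of the database actually in use, the sketch below routes each record to a shard by hashing its key. The function name `shard_for` and the key format are hypothetical.

```python
import hashlib


def shard_for(key, shard_count):
    """Return the index of the shard that should hold the given key.

    A stable hash (SHA-256) is used instead of Python's built-in hash(),
    which is randomized per process for strings.
    """
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Hypothetical record keys spread over four shards.
keys = ["x:1001", "4chan:2002", "telegram:3003"]
placement = {key: shard_for(key, 4) for key in keys}
```

Simple modulo routing like this moves many keys when the shard count changes; production systems often use consistent hashing or range-based partitioning to limit that reshuffling.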
Schema-less Design:
NoSQL databases typically employ a schema-less design, so records in the same collection do not have to share a fixed set of fields.
This flexibility enables us to integrate new data sources and adjust record structures as needed, without migrating existing data.
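To make the schema-less idea concrete, here is a toy in-memory collection that accepts documents with different fields side by side. It is a sketch for illustration only; the class `DocumentCollection` and the sample documents are hypothetical and do not represent the production database.

```python
class DocumentCollection:
    """A toy in-memory collection that accepts documents of any shape."""

    def __init__(self):
        self._docs = []

    def insert(self, doc):
        # Copy so later changes to the caller's dict do not alter stored data.
        self._docs.append(dict(doc))

    def find(self, **criteria):
        # A document that lacks a queried field does not match.
        return [
            doc for doc in self._docs
            if all(k in doc and doc[k] == v for k, v in criteria.items())
        ]


posts = DocumentCollection()
# Two sources with different fields share one collection.
posts.insert({"source": "x", "text": "BTC breaks resistance", "likes": 120})
posts.insert({"source": "telegram", "text": "ETH gas is low", "channel": "defi-chat"})

telegram_posts = posts.find(source="telegram")
```

Adding a new source here requires no change to existing documents: its records simply carry whatever extra fields that source provides.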
Data Indexing and Retrieval:
We implement efficient indexing techniques to facilitate fast data retrieval and analysis.
Indexing enables quick access to relevant data points, allowing our analytics algorithms to generate insights in real time.
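The specific indexing techniques are not detailed here. As one common example, an inverted index maps each term to the documents that contain it, so a lookup avoids scanning every record. The sketch below is a simplified illustration; the class `InvertedIndex` and the sample texts are hypothetical.

```python
from collections import defaultdict


class InvertedIndex:
    """Map each lowercase token to the set of document IDs containing it."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, doc_id, text):
        for raw_token in text.lower().split():
            # Strip common punctuation and cashtag/hashtag prefixes.
            token = raw_token.strip(".,!?$#")
            if token:
                self._postings[token].add(doc_id)

    def search(self, term):
        # .get avoids creating empty entries for unknown terms.
        return sorted(self._postings.get(term.lower(), set()))


index = InvertedIndex()
index.add(1, "BTC is pumping!")
index.add(2, "Selling my $BTC and ETH.")

btc_docs = index.search("btc")
```

A lookup such as `index.search("btc")` touches only the posting set for that term, which is what keeps retrieval fast as the number of documents grows.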
Data Replication and Redundancy:
To ensure data durability and availability, we implement data replication and redundancy strategies within the NoSQL database architecture.
This redundancy mitigates the risk of data loss and provides fault tolerance in the event of hardware failures or system outages.
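The replication strategy itself is not specified here. The following sketch shows the general idea under one assumed design: each write goes to every reachable replica and succeeds only if a minimum number acknowledge it, while reads fall back to any replica that is still online. The classes `Replica` and `ReplicatedStore` are hypothetical.

```python
class Replica:
    """A single in-memory copy of the data that can be taken offline."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.online = True

    def write(self, key, value):
        if not self.online:
            raise ConnectionError(f"{self.name} is offline")
        self.data[key] = value

    def read(self, key):
        if not self.online:
            raise ConnectionError(f"{self.name} is offline")
        return self.data[key]


class ReplicatedStore:
    def __init__(self, replicas, write_quorum):
        self.replicas = replicas
        self.write_quorum = write_quorum

    def put(self, key, value):
        # Count how many replicas accepted the write.
        acks = 0
        for replica in self.replicas:
            try:
                replica.write(key, value)
                acks += 1
            except ConnectionError:
                continue
        if acks < self.write_quorum:
            raise RuntimeError(f"only {acks} replicas acknowledged the write")
        return acks

    def get(self, key):
        # Return the value from the first replica that can serve it.
        for replica in self.replicas:
            try:
                return replica.read(key)
            except (ConnectionError, KeyError):
                continue
        raise KeyError(key)


replicas = [Replica("node-1"), Replica("node-2"), Replica("node-3")]
store = ReplicatedStore(replicas, write_quorum=2)
acks = store.put("post:1", {"text": "BTC breaks resistance"})
replicas[0].online = False  # simulate a hardware failure
recovered = store.get("post:1")
```

Even with `node-1` offline, the read succeeds because the same record exists on the other two replicas.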