Data science and engineering involve working on large sets of information, usually contained in some form of database. Normally, extracting this data requires proficiency in the Structured Query Language, commonly known as SQL, to collect the exact information that the engineer requires.
SQL resembles a programming language in that it has a well-defined syntax and requires knowledge of the right keywords and clauses to work with SQL-based databases. It’s used across many database architectures and data platforms as a shared standard to “express the desires” of data engineers. Unlike a general-purpose programming language, SQL is not normally used on its own to build applications, but it does have conditional statements (“if this, then that” statements) and other advanced features.
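To make the “if this, then that” idea concrete, here is a minimal sketch that runs a SQL conditional (a CASE expression) from Python using the standard library’s sqlite3 module. The table name, column names, and sample rows are hypothetical and exist only for illustration.

```python
import sqlite3

# Hypothetical in-memory table, created only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transfers (sender TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO transfers (sender, amount) VALUES (?, ?)",
    [("alice", 5.0), ("bob", 250.0), ("carol", 40.0)],
)

# CASE is SQL's conditional expression: if the amount is at least 100,
# label the row 'large', otherwise label it 'small'.
query = """
SELECT sender,
       CASE WHEN amount >= 100 THEN 'large' ELSE 'small' END AS size
FROM transfers
ORDER BY sender
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Each row comes back with a label chosen by the condition, which is the kind of logic an analyst would otherwise have to write by hand.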
Working with large data sets requires intimate knowledge of SQL, which can be a major obstacle for analysts who aren’t well-versed in programming, especially if data analysis is not their main role. This means that business leaders and strategists will often struggle to make data-driven decisions on their own unless they put in a significant amount of work.
Space and Time’s Houston bot, powered by OpenAI, allows developers and analysts to define a simple AI prompt in “natural” or conversational language. The chatbot will then “translate” the instructions into well-made SQL code that can be directly plugged into the database.
The integration goes further than that, as the prompts can also be used to build fully automated data processing pipelines, dashboards to visualize the data, and custom scripts for further processing.
The integration is live in the Space and Time Studio, a platform providing a user-friendly interface to use the Houston bot. It works through a familiar chatbot interface like ChatGPT, and it connects directly to Space and Time’s “data warehouse,” which includes data from major blockchains like Ethereum (ETH-USD), Polygon (MATIC-USD), BNB Chain, and others. Users can also import their own data sets, including non-blockchain information from other channels.
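The prompt-to-SQL workflow described above can be pictured with a small sketch. Everything in it is an assumption made for illustration: the prompt wording, the table and column names, and the SQL itself, which is hand-written here and is not actual output from the Houston bot or a reflection of Space and Time’s real schema. The query runs against a local in-memory SQLite database rather than the Space and Time data warehouse.

```python
import sqlite3

# Hypothetical schema and rows; not Space and Time's actual table layout.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (chain TEXT, from_address TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO transactions (chain, from_address, value) VALUES (?, ?, ?)",
    [
        ("ethereum", "0xaaa", 1.5),
        ("ethereum", "0xbbb", 0.5),
        ("ethereum", "0xaaa", 2.0),
        ("polygon", "0xccc", 10.0),
    ],
)

# A conversational request, as an analyst might type it (hypothetical).
prompt = "Show the total value sent by each address on Ethereum, largest first."

# Hand-written SQL of the kind a prompt-to-SQL tool might return for that
# request. This is an assumption, not real chatbot output.
generated_sql = """
SELECT from_address, SUM(value) AS total_value
FROM transactions
WHERE chain = 'ethereum'
GROUP BY from_address
ORDER BY total_value DESC
"""
totals = conn.execute(generated_sql).fetchall()
print(prompt)
print(totals)
```

The analyst only writes the sentence in `prompt`. The filtering, grouping, and sorting clauses are the part the article says the chatbot takes over.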
Data Analysis Lagging in Web3 and Blockchain
Data analytics and monitoring in Web3 is still a nascent field, even though each blockchain’s history is fully public and its code is nearly always open source.
The paradox of blockchain data is that there is so much of it that processing it and finding the truly valuable signals becomes incredibly difficult. Furthermore, blockchain software by itself makes it very difficult to query the information it contains, which is why a $1 billion+ industry of blockchain data indexing exists. Space and Time is one such startup, alongside providers like The Graph, Subquery, and others.
External indexing projects also don’t inherit the blockchain’s built-in security and anti-tampering systems, so they need to develop different methods to assure users that the data they’re receiving is accurate. In the case of Space and Time, a cryptographic technology called Proof of SQL offers verifiable guarantees for the data’s integrity, while other projects usually rely on some form of economic incentives through their tokens.
Because extracting data from blockchains is so difficult, becoming a data scientist in Web3 often requires intricate programming knowledge to interact with the raw data. Space and Time hopes that its AI-enabled efforts will help the industry develop better data analysis practices. Scott Dykstra, CTO of Space and Time, claims that “AI-powered SQL is a game-changer for businesses that run a lean analytics team,” explaining that with the Houston bot, “[Getting] indexed blockchain data or off-chain data from your business […] is just a few prompts away.”
Space and Time is heavily backed by Microsoft, which led a $20 million funding round through its venture arm M12 last year. The company has previously added an integration with Microsoft’s Azure, a cloud computing platform, and this OpenAI integration embeds it further in the ecosystem of Microsoft, which holds a significant stake in the AI trailblazer.
For some time now, especially since the release of ChatGPT in late 2022, AI has improved and disrupted a number of industries by empowering the people who work in them. Tools like GitHub Copilot, as well as ChatGPT itself, are being used by tens of millions of users daily, with particular success in computer science. GPT-based platforms, including ChatGPT and Space and Time’s Houston, are remarkably effective and comparatively accurate when applied to coding.
AI tools still require human oversight to produce fully accurate results, so for now, they’re better seen as helpers for already experienced programmers. However, for simpler tasks like SQL queries, we’re seeing the first glimpses of a full AI revolution.