Transcription & Speech
AssemblyAI
AssemblyAI offers their own model, Universal-1, trained on a staggering 12.5M hours of multilingual audio. It claims the highest accuracy in the industry, and benchmarks back that up. The Universal-1 API is also extremely fast, transcribing an hour of audio in about a minute.
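As a rough sketch of what using their API looks like, here is the transcription flow with AssemblyAI's official Python SDK (`pip install assemblyai`); the API key and audio path are placeholders.

```python
def transcribe_with_assemblyai(path: str) -> str:
    """Upload an audio file (or URL) to AssemblyAI and return the transcript text."""
    import assemblyai as aai  # third-party SDK; imported lazily so the sketch reads standalone

    aai.settings.api_key = "YOUR_API_KEY"  # placeholder: set your real key here
    transcriber = aai.Transcriber()
    transcript = transcriber.transcribe(path)  # local file path or public URL
    if transcript.status == aai.TranscriptStatus.error:
        raise RuntimeError(transcript.error)
    return transcript.text

# transcribe_with_assemblyai("meeting.mp3")  # requires an AssemblyAI API key
```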
Whisper
OpenAI's MIT-licensed general-purpose speech recognition model is an absolute beast. With multilingual support, strong performance, and multiple model sizes to choose from, Whisper is the best open-source transcription model to start with. Many companies host implementations of Whisper, making it a simple API call.
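Running Whisper locally is only a few lines with OpenAI's open-source `whisper` package (`pip install openai-whisper`, plus ffmpeg on your system); the file path below is a placeholder, and `"base"` is one of the available sizes (tiny/base/small/medium/large).

```python
def transcribe_file(path: str, size: str = "base") -> str:
    """Transcribe an audio file locally with OpenAI's open-source Whisper model."""
    import whisper  # third-party; imported lazily so the sketch reads standalone

    model = whisper.load_model(size)  # downloads model weights on first use
    result = model.transcribe(path)   # language is auto-detected by default
    return result["text"]

# transcribe_file("interview.mp3")  # requires 'pip install openai-whisper' and ffmpeg
```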
Data Cleaning & Extraction
LlamaIndex
Where data-processing competitors focus on the data itself, LlamaIndex offers a suite of tools that carries data from parsing through ingestion to retrieval, all in support of LLM-based applications. LlamaIndex is also thinking broadly about Enterprise use, which should make these kinds of pipelines simpler in the future.
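The parse-to-ingest-to-retrieve flow can be sketched with LlamaIndex's core API (`pip install llama-index`); this assumes documents in a local `data/` directory, and the default embeddings and LLM require an `OPENAI_API_KEY` in the environment.

```python
def build_and_query(question: str):
    """Parse local documents, build a vector index, and answer a question over them."""
    from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

    documents = SimpleDirectoryReader("data").load_data()  # parse files in ./data
    index = VectorStoreIndex.from_documents(documents)     # ingest: chunk + embed
    query_engine = index.as_query_engine()                 # retrieve + synthesize
    return query_engine.query(question)

# build_and_query("What are the key findings?")  # needs ./data and an OPENAI_API_KEY
```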
Unstructured
Unstructured is a collection of capabilities, offered both as an API/SaaS and as open source, that assist in processing documents. Its outputs can be useful for training AI models, serving up data for RAG, or extracting information for direct processing by non-AI systems. With support for a wide range of document types and platform connectors, it is worth a look to see how much time their tooling can save you.
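A minimal sketch of the open-source library (`pip install "unstructured[pdf]"`): the `partition` function splits a document into typed elements you can feed to a model or a downstream system. The PDF path is a placeholder.

```python
def extract_elements(path: str) -> list[str]:
    """Split a document into typed elements (Title, NarrativeText, Table, ...)."""
    from unstructured.partition.auto import partition  # third-party, imported lazily

    elements = partition(filename=path)  # file type is detected automatically
    # Each element carries a category label alongside its text.
    return [f"{el.category}: {el.text}" for el in elements]

# extract_elements("report.pdf")  # requires 'pip install "unstructured[pdf]"'
```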
Hosting & Inference
Deepinfra
A model hosting platform with numerous image, text, and multimodal models. Deepinfra also offers 1 billion free tokens as part of their DeepStart program for startups meeting certain criteria. Deploying your own models is also possible, with strikingly low prices for access to A100s/H100s. We have used Deepinfra to generate synthetic data from permissively licensed open-source models and loved the experience and the price.
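Deepinfra exposes an OpenAI-compatible endpoint, so the official `openai` client (`pip install openai`) works against it by swapping the base URL; the token is a placeholder and the model name is one example from their catalog.

```python
def ask_deepinfra(prompt: str) -> str:
    """Send a chat completion to Deepinfra via its OpenAI-compatible API."""
    from openai import OpenAI  # third-party client, imported lazily

    client = OpenAI(
        api_key="YOUR_DEEPINFRA_TOKEN",              # placeholder
        base_url="https://api.deepinfra.com/v1/openai",
    )
    resp = client.chat.completions.create(
        model="meta-llama/Meta-Llama-3-8B-Instruct",  # one of many hosted models
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

# ask_deepinfra("Generate one synthetic Q&A pair.")  # requires a Deepinfra token
```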
Groq
Initially a hardware company focused on building processors optimized for LLMs, Groq has now pivoted to being an inference provider. We could spout off a bunch of metrics about how much faster their LPU is at hosting models, but the results speak for themselves: go try their chat interface and enjoy how quickly results come back. We haven't used their LPUs directly yet, but color us impressed so far.
Fireworks
Fireworks provides serverless hosting of many of the top open-source AI models. They also let customers fine-tune LoRAs for use on their platform, tailoring serverless models to your specific needs. For Enterprise customers they also offer dedicated deployments on A100s.
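Fireworks' serverless API is OpenAI-compatible as well, so the same `openai` client pattern applies; note that Fireworks model IDs carry an `accounts/fireworks/models/` prefix, and the key below is a placeholder.

```python
def ask_fireworks(prompt: str) -> str:
    """Send a chat completion to a serverless model on Fireworks."""
    from openai import OpenAI  # third-party client, imported lazily

    client = OpenAI(
        api_key="YOUR_FIREWORKS_KEY",                 # placeholder
        base_url="https://api.fireworks.ai/inference/v1",
    )
    resp = client.chat.completions.create(
        model="accounts/fireworks/models/llama-v3p1-8b-instruct",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

# ask_fireworks("Hello!")  # requires a Fireworks API key
```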
Training & Exploration
RunPod
In the early days of Google Colab you could log in and be granted access to an NVIDIA Tesla V100; no longer. Web-based access to compute (GPU/CPU) is delightful when loading up a notebook for experimentation or model training, and RunPod is our recommendation as the best of the best. We have used it extensively and have had no issues getting the GPUs we need.