
What is 倾语AI?

倾语AI is an offline intelligent assistant that combines a large language model (LLM) with speech recognition. It converts audio and video files into structured PDF documents based on custom templates, taking you from speech to an organized document in a single workflow.

Use cases: generating meeting minutes, organizing interview records, exporting academic notes, writing reports, automatically filling in legal or medical documents, and similar tasks.

Target users: professionals, researchers, students, administrative staff, and anyone else who frequently needs to convert speech to text and produce structured documents.

Core features

  • High-accuracy speech recognition: Recognizes speech in audio and video files in more than 60 languages, including Chinese, English, Japanese, Korean, and German. Supports 21 file formats, including .mp3, .wav, .ape, .flac, .mp4, and .avi.
  • Speaker recognition: Distinguishes between speakers, attributing each sentence to the person who said it. After recognition, speakers can be edited and merged. This feature is experimental.
  • LLM semantic optimization: Automatically corrects recognition errors, removes filler words, polishes grammar, and extracts summaries.
  • Privacy/performance dual mode:
    • Privacy mode (default): All data processing is completed on your device and no data is sent over the internet. This mode suits users who are concerned about data leaks and value privacy and security.
    • Performance mode: You enable the API of an external third-party model. In this mode, speech recognition still runs offline; only the LLM step sends data to the third-party AI for processing. This mode suits users whose devices are underpowered or who need a stronger AI model.
  • CPU and GPU hybrid inference: Supports CPU-only, GPU-only, and hybrid CPU+GPU inference, making full use of your local device's computing power.
  • Intelligent PDF export:
    • Highly customizable export templates: You can design layouts, fonts, colors, logos, and more.
    • Data label mapping: Define labels in the template (such as decision items or summary content). The application passes these labels to the LLM, which organizes the content for you.
    • LLM-based intelligent filling: The LLM automatically extracts the corresponding content from the recognized text and fills it in at the label positions, with no manual input needed.
    • Post-processing: If you are not satisfied with the document content the LLM produced, you can have the LLM export it again, or edit the content yourself and export the final PDF document.
    • Export history: All exported document data is stored on your local device, so you can modify it or re-export PDF documents at any time.
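The label mapping and filling steps above can be pictured with a small sketch. Everything in it is hypothetical: the `{{label}}` placeholder syntax, the `find_labels` and `fill_template` helpers, and the sample labels are assumptions for illustration, not 倾语AI's actual template format or API. It only shows the general idea of collecting preset labels from a template and replacing them with content that an LLM has extracted.

```python
import re

# Hypothetical placeholder syntax: {{label_name}}.
# The application's real template format is not documented here.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def find_labels(template: str) -> list[str]:
    """Return the distinct labels in the template, in order of first appearance."""
    seen: list[str] = []
    for name in PLACEHOLDER.findall(template):
        if name not in seen:
            seen.append(name)
    return seen


def fill_template(template: str, extracted: dict[str, str],
                  missing: str = "[not found]") -> str:
    """Replace each label with its extracted content, or a visible marker if absent."""
    return PLACEHOLDER.sub(lambda m: extracted.get(m.group(1), missing), template)


template = "Summary: {{summary}}\nDecisions: {{decision_items}}\nOwner: {{owner}}"

# Stand-in for what an LLM might return after reading the transcript.
extracted = {
    "summary": "Budget review for Q3.",
    "decision_items": "Approve vendor contract.",
}

labels = find_labels(template)            # labels that would be handed to the LLM
filled = fill_template(template, extracted)
print(labels)
print(filled)
```

Leaving a visible marker for labels the model could not fill makes gaps easy to spot during the post-processing step, where the content can be edited by hand before the final export.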

Workflow overview

flowchart TD
    A["Input audio/video (upload file)"] --> B["Speech recognition generates a draft"]
    B --> C["Optimize text with the LLM"]
    C --> D["Select or design an export template"]
    D --> E["LLM matches labels and fills them in"]
    E --> F["Fine-tune content and export PDF"]
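As a rough illustration of how this flow interacts with the privacy and performance modes, the sketch below chains the stages as plain functions. All names (`Mode`, `transcribe_locally`, `run_llm`, `process`) are hypothetical stubs, not part of the application; the stubs return tagged strings so that it is easy to see which step would run locally and which would go to a third-party API.

```python
from enum import Enum


class Mode(Enum):
    PRIVACY = "privacy"          # default: everything stays on the device
    PERFORMANCE = "performance"  # LLM steps are sent to an external third-party API


def transcribe_locally(path: str) -> str:
    # Stub: speech recognition runs on the device in both modes.
    return f"draft transcript of {path}"


def run_llm(task: str, text: str, mode: Mode) -> str:
    # Stub: only the LLM step changes location depending on the mode.
    where = "local" if mode is Mode.PRIVACY else "remote"
    return f"[{where}:{task}] {text}"


def process(path: str, mode: Mode = Mode.PRIVACY) -> str:
    draft = transcribe_locally(path)                 # A -> B
    polished = run_llm("polish", draft, mode)        # C
    # D: template selection is a user action and is omitted here.
    filled = run_llm("fill_labels", polished, mode)  # E
    return filled                                    # F: review, then export to PDF


print(process("meeting.mp3"))
print(process("meeting.mp3", Mode.PERFORMANCE))
```

Keeping transcription outside the mode switch mirrors the description of performance mode, in which only the LLM part leaves the device.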

System requirements and compatibility

  • Operating system: Windows 10 (version 1607 or later) or Windows 11, 64-bit
  • Hardware: 4 GB of memory or more. A graphics card that supports Vulkan is recommended (most graphics cards support Vulkan, and you can choose CPU or GPU inference for both the LLM and audio transcription). Better hardware lets you run better models, which improves performance and efficiency.
  • Networking: Downloading the speech recognition models and LLMs requires an internet connection. Beyond that, only the first activation of the application requires a connection; once activated, it can be used fully offline indefinitely. No feature of the application requires a connection unless you choose to enable external AI models.

Version information and update channels

  • Current version number: 1.0.0
  • Updated on: April 3, 2026
  • Update and download: Updates and downloads are available from the Microsoft Store.