Smabbler :: Documentation

Main entities in Smabbler

MODEL (or TEXT MODEL)
– a configuration of the Smabbler engine that consists of query commands and specifies which information (TEXT FEATURE) to extract from text. A model can be one of the predefined models (PUBLIC) or a user-defined model (PRIVATE). Each text model consists of one or more QUERIES.

QUERY
– a command to the Graph Language Model to extract a specified TEXT FEATURE. Queries are built in QueryLab.

CONCEPT
– a single word, topic, phrase, or value that a TEXT MODEL can extract. For example, {date value} or {colors} can be a concept, but a concept can also be more specific, e.g. {problem description} or {health conditions}.

CONTEXT
– a single word, topic, phrase, or value that makes one or more CONCEPTS more specific. For example, {climate} can be a concept and {emotions} can be its context.

TEXT FEATURE
– a label, annotation, or classification. It is an output of Smabbler’s text processing that can be used as input for later stages in a data pipeline or consumed directly. Note that multiple text features can be produced for each input text.

QUERYLAB (text model builder)
– a configuration mechanism in the Smabbler UI for constructing new text models. QueryLab provides an easy way to create Graph Language Model queries by choosing nodes from the Galaxia graph or providing your own concepts until all desired text feature criteria are selected.

GALAXIA
– a graph language model (GLM) built to recognize and extract information from text. Galaxia is a foundation model on top of which other text models and applications can be built.

NODES
– data points in the Galaxia graph that represent words, phrases, and definitions.

EDGES
– connections between Galaxia nodes. They can propagate (transfer) features between nodes, including nodes that are not directly connected.
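
The relationships between these entities can be sketched as a small data structure. This is only an illustration of the definitions above, not Smabbler's actual API or storage format: the class names (Visibility, Query, TextModel), the field names, and the example model name are all hypothetical, and the rule that a query needs at least one concept is an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum


class Visibility(Enum):
    PUBLIC = "public"    # one of the predefined models
    PRIVATE = "private"  # a user-defined model


@dataclass
class Query:
    """One command that extracts a specified text feature (hypothetical shape)."""
    name: str
    concepts: list[str]
    contexts: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Assumption: a query is defined by at least one concept.
        if not self.concepts:
            raise ValueError("a query needs at least one concept")


@dataclass
class TextModel:
    """A named configuration made of one or more queries."""
    name: str
    visibility: Visibility
    queries: list[Query]

    def __post_init__(self) -> None:
        # From the MODEL definition: each text model has one or more queries.
        if not self.queries:
            raise ValueError("a text model needs at least one query")


# The {climate} concept with the {emotions} context, as in the CONTEXT example.
model = TextModel(
    name="climate-sentiment",
    visibility=Visibility.PRIVATE,
    queries=[Query(name="climate emotions", concepts=["climate"], contexts=["emotions"])],
)
```

Keeping contexts optional mirrors the glossary, where a context only refines a concept and is added when needed.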

Main entities in text analysis

RESULT
– the specified TEXT FEATURE extracted from text. A result can be derived from one or multiple words.

CATEGORY
– a QUERY name

MODEL
– the name of the active text model used for text processing.
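
As a rough illustration, one analysis output item can be pictured as a record with these three fields. The record type AnalysisRecord, the helper group_by_category, and the sample values are made up for this sketch and do not reflect Smabbler's actual output format. The sketch also assumes that CATEGORY holds the name of the query that produced the result.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AnalysisRecord:
    result: str    # RESULT: the text feature extracted from the text
    category: str  # CATEGORY: a query name
    model: str     # MODEL: the name of the active text model


def group_by_category(records):
    """Collect result values under their category (query name)."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record.category].append(record.result)
    return dict(grouped)


# Made-up sample data: three results from one hypothetical model.
records = [
    AnalysisRecord("headache", "health conditions", "support-tickets"),
    AnalysisRecord("2024-01-15", "date value", "support-tickets"),
    AnalysisRecord("fever", "health conditions", "support-tickets"),
]
by_category = group_by_category(records)
```

Grouping by category is one way to handle the fact that a single input text can produce multiple text features.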

Application scenario 1: ML model training pipeline

[Figure: Smabbler in an ML model training pipeline]
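
One common way to feed extracted text features into a training pipeline is to turn them into fixed-length numeric vectors. The sketch below shows plain multi-hot encoding under that assumption. It is a generic preprocessing step, not a Smabbler function, and the helper names (build_vocabulary, multi_hot) and the feature labels are hypothetical.

```python
def build_vocabulary(feature_lists):
    """Map each distinct text feature to a column index, in sorted order."""
    features = sorted({f for feats in feature_lists for f in feats})
    return {feature: index for index, feature in enumerate(features)}


def multi_hot(features, vocabulary):
    """Encode one text's features as a 0/1 vector over the vocabulary."""
    vector = [0] * len(vocabulary)
    for feature in features:
        if feature in vocabulary:  # features outside the vocabulary are skipped
            vector[vocabulary[feature]] = 1
    return vector


# Made-up text features for three input texts.
extracted = [
    ["billing", "complaint"],
    ["delivery"],
    ["billing", "delivery"],
]
vocabulary = build_vocabulary(extracted)
vectors = [multi_hot(feats, vocabulary) for feats in extracted]
```

The resulting rows can be passed to any downstream learner that accepts numeric input.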

Application scenario 2: BI reporting

Text Model Building

Smabbler offers a set of powerful public text models specialized in extracting information for a wide range of scenarios. Many users, however, may want to explore more individual cases and extract more specific information. For that purpose, Smabbler offers QueryLab, a mechanism for building private text models.

Building a Text Model in the UI

To build a text model in the UI:
- Go to 'QueryLab'
- Select 'Create Model'
- Select scenario: file-based or freestyle
- Name your query
- Define your query by providing a Concept(s)
- Add Context if needed
- Build query, choosing a mode:
  - narrow - to extract synonymous text features from text
  - extended - to extract similar text features from text
- Name and define any additional Queries if needed
- Generate your model
- Activate your model
- Use your model for text processing
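
The order of the steps above can be sketched as a small state machine. This models only the workflow as described for the UI and is not a Smabbler API: ModelDraft, BuildMode, ModelState, and all method names are hypothetical. The rules that queries cannot be added after generation and that a query needs at least one concept are assumptions made for the sketch.

```python
from enum import Enum


class BuildMode(Enum):
    NARROW = "narrow"      # extract synonymous text features
    EXTENDED = "extended"  # extract similar text features


class ModelState(Enum):
    DRAFT = "draft"
    GENERATED = "generated"
    ACTIVE = "active"


class ModelDraft:
    """Tracks a private model through define -> generate -> activate."""

    def __init__(self, name):
        self.name = name
        self.queries = []
        self.state = ModelState.DRAFT

    def add_query(self, name, concepts, contexts=(), mode=BuildMode.NARROW):
        # Assumption: queries can only be added while the model is a draft.
        if self.state is not ModelState.DRAFT:
            raise RuntimeError("queries can only be added before generation")
        if not concepts:
            raise ValueError("a query needs at least one concept")
        self.queries.append(
            {"name": name, "concepts": list(concepts),
             "contexts": list(contexts), "mode": mode}
        )

    def generate(self):
        if self.state is not ModelState.DRAFT:
            raise RuntimeError("the model has already been generated")
        if not self.queries:
            raise RuntimeError("define at least one query first")
        self.state = ModelState.GENERATED

    def activate(self):
        if self.state is not ModelState.GENERATED:
            raise RuntimeError("generate the model before activating it")
        self.state = ModelState.ACTIVE


draft = ModelDraft("climate-sentiment")
draft.add_query("climate emotions", ["climate"], ["emotions"], BuildMode.EXTENDED)
draft.generate()
draft.activate()
```

Only a model that has reached the active state would be used for text processing, which matches the last two steps of the list.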

Comparison

GALAXIA GRAPH LANGUAGE MODEL (GLM)
VS.
NEURAL NETWORK BASED LARGE LANGUAGE MODELS (LLMs)

These are two independent technologies with different architectures and different methods of operation.

Neural network-based LLMs require large amounts of data. Improving their performance requires more data, storage, and processing capacity. To provide reasonable latency, they need large computing resources and expensive hardware, mainly GPUs. When such a model cannot produce a relevant result, it may force a response that does not fit the input or is factually incorrect, which is termed "hallucinating." LLMs are generative models.

Galaxia GLM uses millions of nodes and the relations between them to process natural language. To enhance Galaxia's performance, new domain knowledge can be integrated and new connections between existing nodes can be created. Galaxia runs on CPUs. When it has no relevant result, it does not attempt to provide an irrelevant answer. It is designed to identify and extract information from written language. Galaxia is an inference model.