Document Key-Value Extraction Agent

The DocKV Agent is an AI-powered tool that extracts key-value pairs from an image, given a set of input keys and a short description of each. AgentForce currently supports three different versions of the DocKV agent. More versions will be coming soon.

1. Agent Input Dictionary

First, define an agent input dictionary. For a DocKV agent, it has the following format:

inputs_dict = {
    "fields": {
        # Map each key to extract to a short description of it.
        # The descriptions are supposed to be spatially aware.
        "field_1": "description_1",
        "field_2": "description_2",
    },
    "base64_image": img_processed,  # the image, base64-encoded
    "image_type": "jpeg",  # or "png", or whatever image type is being used
}
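
The format above does not show how img_processed is produced. The following is a minimal sketch, assuming img_processed is simply the base64-encoded contents of the image file as a text string (the service may expect additional preprocessing). The helper names encode_image and build_inputs are hypothetical and are not part of AgentForce; only the three top-level keys come from the format shown above.

```python
import base64
from pathlib import Path


def encode_image(path):
    """Read an image file and return its bytes as a base64 ASCII string."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")


def build_inputs(fields, image_path, image_type="jpeg"):
    """Assemble an inputs dictionary in the DocKV format.

    Hypothetical helper: the emptiness check is an added convenience,
    not a documented requirement of the agent.
    """
    if not fields:
        raise ValueError("fields must contain at least one key/description pair")
    return {
        "fields": dict(fields),
        "base64_image": encode_image(image_path),
        "image_type": image_type,
    }
```

With this in place, inputs_dict = build_inputs({"field_1": "description_1"}, "scan.jpg") yields a dictionary with the same three keys as the hand-written version, where "scan.jpg" stands in for your own image path.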

2. Agent's Process Functionality

After defining the inputs dictionary, set agent_name to the DocKV version you want to use and pass both to agent.process.

# `agent` is assumed to be an already-initialized agent object.
agent_name = "latentforce/doc_kv/v1"
res = agent.process(agent_name=agent_name, agent_inputs=inputs_dict)
print(res)

The available DocKV versions are listed below.

  1. latentforce/doc_kv/v1

  2. latentforce/doc_kv/v2

  3. latentforce/doc_kv/v3
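
Since the agent name is a plain string, a typo in it is easy to make. As a small sketch, the three names listed above can be kept in one place and checked before calling the agent. The names DOCKV_VERSIONS and dockv_agent_name are hypothetical helpers, not part of AgentForce, and the tuple would need updating as new versions are released.

```python
# The version suffixes of the three agent names listed above.
DOCKV_VERSIONS = ("v1", "v2", "v3")


def dockv_agent_name(version):
    """Return the full agent name for a listed DocKV version, e.g. "v2"."""
    if version not in DOCKV_VERSIONS:
        raise ValueError(
            f"unknown DocKV version {version!r}; expected one of {DOCKV_VERSIONS}"
        )
    return f"latentforce/doc_kv/{version}"
```

The result can be passed directly as the agent_name argument, for example agent_name = dockv_agent_name("v2").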