Text Recognition API
API Description
The Text Recognition API recognizes text in an image and supports returning the position of the recognized text.
Request Description
URL
https://openapi.ocr.sys303.com/api/v1/ocr/general?access_token={token}
Parameters
URL Parameters
Parameter | Value |
---|---|
access_token | The access_token obtained through the API Key and Secret Key. |
Header Parameters
Parameter | Value |
---|---|
Content-Type | application/x-www-form-urlencoded |
Body Parameters
Parameter | Required | Type | Default / Status | Description |
---|---|---|---|---|
image | One of four options | string | - | Image data, base64 encoded and URL encoded, size should not exceed 10MB, shortest side at least 15px, longest side max 8192px. Supports jpg/jpeg/png/bmp formats. Priority: image > url > pdf_file > ofd_file |
language_type | No | string | 0 | Recognition language type, default is Tibetan [0: Tibetan, 1: Chinese]. |
url | One of four options | string | [Under development] | Full URL of the image, URL length should not exceed 1024 bytes, base64 encoded image from the URL should not exceed 10MB, shortest side at least 15px, longest side max 8192px. Priority: image > url > pdf_file > ofd_file |
pdf_file | One of four options | string | [Under development] | PDF file, base64 encoded and URL encoded, size should not exceed 10MB, shortest side at least 15px, longest side max 8192px. Priority: image > url > pdf_file > ofd_file |
ofd_file | One of four options | string | [Under development] | OFD file, base64 encoded and URL encoded, size should not exceed 10MB, shortest side at least 15px, longest side max 8192px. Priority: image > url > pdf_file > ofd_file |
pdf_file_num | No | string | [Under development] | Page number of the PDF file to recognize. Takes effect only when the pdf_file parameter is valid. If not provided, the first page is recognized by default. |
ofd_file_num | No | string | [Under development] | Page number of the OFD file to recognize. Takes effect only when the ofd_file parameter is valid. If not provided, the first page is recognized by default. |
recognize_granularity | No | string | [Under development] | Whether to locate the position of individual characters. Options: big (do not locate individual characters, default value), small (locate individual characters). |
detect_direction | No | string | [Under development] | Whether to detect the image orientation. Options: true (detect orientation), false (do not detect orientation, default). Orientation means whether the input image is upright or rotated counterclockwise by 90, 180, or 270 degrees. If the input image may not be upright, setting this parameter to true is recommended for better recognition results. |
vertexes_location | No | string | [Under development] | Whether to return the vertex locations of the outer polygon around the text. Single character position is not supported. Default is false. |
paragraph | No | string | [Under development] | Whether to output paragraph information. |
probability | No | string | [Under development] | Whether to return the confidence level for each line in the recognition result. |
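The encoding rules for the image parameter can be sketched in Python as follows. This is a minimal illustration, not part of the official SDK: the helper name build_ocr_body is hypothetical, and it assumes the 10MB limit applies to the base64-encoded data (the table does not say whether the limit refers to the raw or the encoded size).

```python
import base64
from urllib.parse import urlencode

# Assumption: the 10MB limit is checked against the base64-encoded size.
MAX_ENCODED_BYTES = 10 * 1024 * 1024


def build_ocr_body(image_bytes: bytes, language_type: int = 0) -> str:
    """Build an application/x-www-form-urlencoded body (hypothetical helper)."""
    if language_type not in (0, 1):
        raise ValueError("language_type must be 0 (Tibetan) or 1 (Chinese)")
    # Step 1: base64-encode the raw image bytes.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    if len(encoded) > MAX_ENCODED_BYTES:
        raise ValueError("encoded image exceeds 10MB")
    # Step 2: URL-encode the form fields; '+', '/' and '=' in the base64
    # text are percent-escaped here.
    return urlencode({"image": encoded, "language_type": language_type})
```

The returned string can be sent as the POST body together with the Content-Type header listed under Header Parameters.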
Request Example
Examples are provided in bash (curl), Python, C#, and Java.
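bash
# Base64-encode the image, strip line breaks, and send it as a URL-encoded form field
base64 < '【image path】' | tr -d '\n' | curl --request POST \
  --url 'https://openapi.ocr.sys303.com/api/v1/ocr/general?access_token=【access_token】' \
  --header 'Content-Type: application/x-www-form-urlencoded' \
  --data-urlencode 'image@-' \
  --data-urlencode 'language_type=0'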
python
# encoding:utf-8
import base64

import requests


def main():
    request_url = 'https://openapi.ocr.sys303.com/api/v1/ocr/general'
    access_token = '【access_token】'
    # Read the image in binary mode and base64-encode it
    with open('【image path】', 'rb') as f:
        img = base64.b64encode(f.read())
    # requests URL-encodes the form fields when data is a dict
    params = {"image": img}
    request_url = request_url + "?access_token=" + access_token
    headers = {'content-type': 'application/x-www-form-urlencoded'}
    response = requests.post(request_url, data=params, headers=headers)
    # A Response object is truthy only for status codes below 400
    if response:
        print(response.json())


if __name__ == '__main__':
    main()
C#
using System;
using System.IO;
using System.Net;
using System.Text;

public class AccurateBasic
{
    public static string RunAccurateBasic()
    {
        string requestUrl = "https://openapi.ocr.sys303.com/api/v1/ocr/general";
        string accessToken = "【access_token】";
        string imagePath = "【image path】";
        string fullUrl = requestUrl + "?access_token=" + accessToken;

        // Base64-encode the image, then URL-encode it for the form body.
        // WebUtility.UrlEncode is used because Uri.EscapeDataString rejects
        // very long strings on .NET Framework.
        string base64Image = GetFileBase64(imagePath);
        string postData = "image=" + WebUtility.UrlEncode(base64Image);

        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(fullUrl);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        byte[] data = Encoding.UTF8.GetBytes(postData);
        request.ContentLength = data.Length;
        using (Stream requestStream = request.GetRequestStream())
        {
            requestStream.Write(data, 0, data.Length);
        }

        try
        {
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            using (StreamReader reader = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
            {
                string result = reader.ReadToEnd();
                Console.WriteLine(result);
                return result;
            }
        }
        catch (WebException ex)
        {
            // ex.Response is null when no HTTP response was received (e.g. DNS failure)
            if (ex.Response != null)
            {
                using (StreamReader reader = new StreamReader(ex.Response.GetResponseStream()))
                {
                    Console.WriteLine("Error Response:");
                    Console.WriteLine(reader.ReadToEnd());
                }
            }
            throw;
        }
    }

    public static string GetFileBase64(string filePath)
    {
        byte[] fileBytes = File.ReadAllBytes(filePath);
        return Convert.ToBase64String(fileBytes);
    }

    public static void Main(string[] args)
    {
        RunAccurateBasic();
    }
}
Java
import java.io.*;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Base64;

public class RunAccurateBasic {
    public static void main(String[] args) {
        String requestUrl = "https://openapi.ocr.sys303.com/api/v1/ocr/general";
        String accessToken = "【access_token】";
        String imagePath = "【image path】";
        try {
            String result = runAccurateBasic(requestUrl, accessToken, imagePath);
            System.out.println("OCR Result: " + result);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static String runAccurateBasic(String requestUrl, String accessToken, String imagePath) throws Exception {
        String imageBase64 = encodeFileToBase64(imagePath);
        String fullUrl = requestUrl + "?access_token=" + accessToken;
        URL url = new URL(fullUrl);
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("POST");
        connection.setDoOutput(true);
        connection.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");

        // The base64 string must be URL-encoded before it goes into the form body
        String params = "image=" + URLEncoder.encode(imageBase64, "UTF-8");
        try (OutputStream outputStream = connection.getOutputStream()) {
            outputStream.write(params.getBytes("UTF-8"));
        }

        int responseCode = connection.getResponseCode();
        if (responseCode == 200) {
            return readAll(connection.getInputStream());
        }
        // getErrorStream() returns null when the server sent no error body
        InputStream errorStream = connection.getErrorStream();
        String errorBody = errorStream == null ? "" : readAll(errorStream);
        throw new Exception("Error Response: " + errorBody);
    }

    // Reads a UTF-8 stream to the end and closes it
    private static String readAll(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, "UTF-8"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line);
            }
        }
        return sb.toString();
    }

    public static String encodeFileToBase64(String filePath) throws IOException {
        // Files.readAllBytes reads the whole file; a single InputStream.read call may not
        byte[] fileBytes = Files.readAllBytes(Paths.get(filePath));
        return Base64.getEncoder().encodeToString(fileBytes);
    }
}
Return Explanation
Parameter Description
Field | Required | Type | Status | Description |
---|---|---|---|---|
log_id | Yes | uint64 | - | Unique log ID used for troubleshooting. |
words_result_num | Yes | uint32 | - | Number of recognition results, representing the number of elements in words_result. |
paragraphs_result_num | Yes | uint32 | - | Number of recognition results, representing the number of elements in paragraphs_result. |
words_result | Yes | array[] | - | Array of recognition results. |
+ words | No | string | - | Recognized text string. |
+ location | No | string | [In Development] | String location information. |
++ top | No | string | [In Development] | Vertical coordinate of the top-left corner of the bounding rectangle for location. |
++ left | No | string | [In Development] | Horizontal coordinate of the top-left corner of the bounding rectangle for location. |
++ width | No | string | [In Development] | Width of the bounding rectangle for location. |
++ height | No | string | [In Development] | Height of the bounding rectangle for location. |
+ probability | No | object | [In Development] | Confidence values for each line of recognition results, includes average (line confidence average), variance (line confidence variance), min (minimum line confidence). This field is returned when probability=true. |
paragraphs_result | No | array[] | [In Development] | Paragraph detection results, returned when paragraph=true. |
+ words_result_idx | No | array[] | [In Development] | Line numbers included in a paragraph, returned when paragraph=true. |
pdf_file_size | No | string | [In Development] | Total page count of the input PDF file, returned when the pdf_file parameter is valid. |
ofd_file_size | No | string | [In Development] | Total page count of the input OFD file, returned when the ofd_file parameter is valid. |
direction | No | int32 | [In Development] | Image orientation, returned when detect_direction=true. Values: -1 (undefined), 0 (normal), 1 (90° counterclockwise), 2 (180° counterclockwise), 3 (270° counterclockwise). |
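As an illustration of the location fields, the sketch below converts a bounding rectangle into corner coordinates. The function name location_to_box is hypothetical, and the code assumes pixel coordinates with the origin at the top-left of the image and the y axis pointing down, which the table does not state explicitly. Because the table lists these fields as strings, values are passed through int() so that both numeric and string forms are accepted.

```python
from typing import Dict, Tuple, Union


def location_to_box(location: Dict[str, Union[int, str]]) -> Tuple[int, int, int, int]:
    """Convert left/top/width/height into (x_min, y_min, x_max, y_max)."""
    left = int(location["left"])
    top = int(location["top"])
    width = int(location["width"])
    height = int(location["height"])
    # Assumption: origin at the top-left corner, y grows downward.
    return (left, top, left + width, top + height)
```

A caller could apply this to each element of words_result that carries a location object, for example to draw boxes over the source image.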
Return Example
Success
{
"logId": 0,
"words_result_num": 2,
"words_result": [
{
"words": "Tibet OCR",
"location": {
"left": 0,
"top": 0,
"width": 0,
"height": 0
}
},
{
"words": "Changyuan Shengbang Technology Co., Ltd.",
"location": {
"left": 0,
"top": 0,
"width": 0,
"height": 0
}
}
]
}
Failure
{
"error_code": 110,
"error_msg": "Access token invalid or no longer valid"
}
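A client typically needs to tell the two response shapes above apart. The following is a minimal sketch, assuming that a failure is signalled by the presence of error_code in the JSON body, as in the Failure example; the names OcrError and extract_lines are hypothetical and not part of the API.

```python
from typing import Any, Dict, List


class OcrError(Exception):
    """Raised when the response body carries error_code (hypothetical type)."""

    def __init__(self, code: int, message: str) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def extract_lines(payload: Dict[str, Any]) -> List[str]:
    """Return the recognized text of each line, or raise OcrError on failure."""
    if "error_code" in payload:
        raise OcrError(payload["error_code"], payload.get("error_msg", ""))
    # Each element of words_result holds one recognized line in "words".
    return [item["words"] for item in payload.get("words_result", [])]
```

With the Python request example, this could be called as extract_lines(response.json()).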