
syedzainnasir

I have more than 10k text files that look like the one below. They all share the same format but differ in size; some are longer and some are shorter.
[code]
[{u'language': u'english', u'area': 3825.8953168044045, u'class': u'machine printed', u'utf8_string': u'troia', u'image_id': 428035, u'box': [426.42422762784093, 225.33333055900806, 75.15151515151516, 50.909090909090864], u'legibility': u'legible', u'id': 1056659}, {u'language': u'na', u'area': 24201.285583103767, u'id': 1056660, u'image_id': 428035, u'box': [223.99998520359847, 249.57575480143228, 172.12121212121215, 140.6060606060606], u'legibility': u'illegible', u'class': u'machine printed'}]
[/code]
I want to extract two values that change from file to file, using regular expressions.

The output should look like this:
[code]
box = [223.99998520359847, 249.57575480143228, 172.12121212121215, 140.6060606060606]
box1 = .. sometimes there is more than one
[/code]
and the second output:
[code]
word = troia
word1 = ... sometimes there is more than one word
[/code]
My code 1, for the word extraction:
[code]
fid = fopen('text1.txt','r');
C = textscan(fid, '%s','Delimiter','');
fclose(fid);
C = C{:};
Lia = ~cellfun(@isempty, strfind(C,'utf8_string'));
output = [C{find(Lia)}];
expression = 'u''utf8_string'': u+'
matchStr = regexp(output, expression,'match');
[/code]
Code 1 returns only
[code]utf8_string[/code]
My code 2, for the box number extraction:
[code]
s = sprintf('text_.txt');
fid = fopen(s);
tline = fgetl(fid);
C = regexp(tline,'u''box'': +\[([0-9\. ,]+)\]','tokens');
C = cellfun(@(x) x{1},C,'UniformOutput',false)';
M = cell2mat(cellfun(@(x) x', cat(1,C2{:}),'UniformOutput',false));
[/code]
Code 2 runs, but not on every file. Sometimes I get this error:
[code]
Error using cat
Dimensions of matrices being concatenated are not consistent
[/code]
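A minimal sketch of one way both extractions could be done with capture groups, offered as a possible approach rather than a confirmed fix. It assumes every box holds four numbers and that words never contain a single quote. The variable names (txt, wordTok, boxTok, words, boxes) are made up for illustration, and the sample string is a shortened stand-in for what fileread('text1.txt') would return. Note that the pattern in code 1 has no part that matches the value after u', which would explain why only the key comes back.

```matlab
% Shortened sample in the same format as the question. For a real file,
% txt = fileread('text1.txt'); would take the place of this literal.
txt = ['[{u''language'': u''english'', u''utf8_string'': u''troia'', ' ...
       'u''box'': [426.4, 225.3, 75.1, 50.9]}, ' ...
       '{u''language'': u''na'', u''box'': [223.9, 249.5, 172.1, 140.6]}]'];

% Words: capture the text between the quotes that follow u'utf8_string': u'
wordTok = regexp(txt, 'u''utf8_string'':\s*u''([^'']*)''', 'tokens');
words = cellfun(@(t) t{1}, wordTok, 'UniformOutput', false);

% Boxes: capture the text between the brackets that follow u'box':
boxTok = regexp(txt, 'u''box'':\s*\[([^\]]*)\]', 'tokens');

% Convert each captured string to a numeric row BEFORE stacking, so the
% concatenation depends on the count of numbers, not on string length.
boxRows = cellfun(@(t) str2double(strsplit(t{1}, ',')), boxTok, ...
                  'UniformOutput', false);
boxes = vertcat(boxRows{:});   % one row per box (assumes 4 numbers each)
```

Here words is a cell array with one entry per utf8_string found, and boxes has one row per box. Concatenating the captured strings directly can fail when they differ in length, which may be what triggers the cat error on some files; converting to numbers first sidesteps that, as long as every box has the same number of values.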