syedzainnasir

Tool to detect addresses via machine learning
Question: 09-Mar-2017
In: Computer Software Projects
I'm currently developing a tool that aims to detect addresses (or any other pattern, such as a job or a sports team) in a text.

So what I'm currently doing:

1/ Splitting the text into words
2/ Stemming the words

Users can create categories (job, sport team, address...) and will manually assign a sentence to a category.

Each stemmed word of this sentence will be stored in the DB, with an updated score (+1).

When I browse a new document, I will compute a score for each sentence from the scores of all the words in it.

Example:

I live in Brown Street, in London

=> (live+1, Brown +1, Street+1, London+1)

Then next time I see

I live in Orange Street, in London

The score will be 3 (live+1, Street+1, London+1), so I can say "this sentence might be an address". If the user validates, I update the words (live+1, orange+1, street+1, london+1). If they say "inaccurate", all the words will be downvoted.

I think with more runs, I will be able to detect addresses since "Street" and "London" will have a large score (same for zip code etc)
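
To make the scoring concrete, here is a minimal sketch of what I have in mind, in Python. The names (`CategoryScores`, `tokenize`, `stem`) and the stop-word list are assumptions for illustration only, and `stem` is just a lowercase placeholder rather than a real stemmer.

```python
import re
from collections import defaultdict

# Assumed stop-word list: the example above ignores words like "I" and "in".
STOP_WORDS = {"i", "in", "the", "a", "an", "of", "at", "on"}


def stem(word):
    # Placeholder: a real stemmer (e.g. Porter) would go here.
    return word.lower()


def tokenize(sentence):
    # Split into alphanumeric words, drop stop words, then stem.
    words = re.findall(r"[A-Za-z0-9]+", sentence)
    return [stem(w) for w in words if w.lower() not in STOP_WORDS]


class CategoryScores:
    """Per-category word scores, updated from user feedback."""

    def __init__(self):
        self.scores = defaultdict(int)

    def update(self, sentence, delta=1):
        # delta=+1 when the user validates, delta=-1 for "inaccurate".
        for word in set(tokenize(sentence)):
            self.scores[word] += delta

    def score(self, sentence):
        # Sum the stored scores of the distinct words in the sentence.
        return sum(self.scores.get(w, 0) for w in set(tokenize(sentence)))


address = CategoryScores()
address.update("I live in Brown Street, in London")
result = address.score("I live in Orange Street, in London")
print(result)
```

With only the first sentence stored, the second one scores on "live", "street" and "london", which matches the example above.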

My question is:

First, what do you think about this approach?

Secondly, context is simply ignored with this approach. A sentence containing both Street and London should get a better score: if I detect Street and London in the same sentence, it is likely an address.
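
One way I could picture taking context into account is to score word pairs that occur in the same sentence, in addition to single words. This is only a rough sketch under my own assumptions: `update_pairs`, `pair_score` and the stop-word list are hypothetical names, and stemming is reduced to lowercasing.

```python
import re
from collections import Counter
from itertools import combinations

# Assumed stop-word list, for illustration only.
STOP_WORDS = {"i", "in", "the", "a", "an", "of", "at", "on"}

# Maps an alphabetically ordered word pair to its score.
pair_scores = Counter()


def tokenize(sentence):
    # Lowercasing stands in for real stemming here.
    words = re.findall(r"[A-Za-z0-9]+", sentence)
    return [w.lower() for w in words if w.lower() not in STOP_WORDS]


def pairs(sentence):
    # Sorting makes (london, street) and (street, london) the same key.
    words = sorted(set(tokenize(sentence)))
    return list(combinations(words, 2))


def update_pairs(sentence, delta=1):
    for pair in pairs(sentence):
        pair_scores[pair] += delta


def pair_score(sentence):
    return sum(pair_scores.get(pair, 0) for pair in pairs(sentence))


update_pairs("I live in Brown Street, in London")
pair_result = pair_score("I live in Orange Street, in London")
print(pair_result)
```

The number of pairs grows quadratically with sentence length, which is exactly the storage concern I have.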

How can I store this information in a database? I'm currently using a relational database (MySQL), but I'm afraid the size will become huge if I store the link between each word.
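
For reference, this is roughly the kind of table I am considering. It is a sketch only: it uses Python's built-in `sqlite3` as a stand-in for MySQL, and the table name `pair_score`, its columns, and the helpers `bump_pair` and `get_pair` are hypothetical.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE pair_score (
        category TEXT NOT NULL,
        word_a   TEXT NOT NULL,
        word_b   TEXT NOT NULL,
        score    INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (category, word_a, word_b)
    )
    """
)


def bump_pair(category, w1, w2, delta=1):
    # Store each pair once, in alphabetical order, to halve the row count.
    a, b = sorted((w1, w2))
    conn.execute(
        "INSERT OR IGNORE INTO pair_score (category, word_a, word_b) "
        "VALUES (?, ?, ?)",
        (category, a, b),
    )
    conn.execute(
        "UPDATE pair_score SET score = score + ? "
        "WHERE category = ? AND word_a = ? AND word_b = ?",
        (delta, category, a, b),
    )


def get_pair(category, w1, w2):
    a, b = sorted((w1, w2))
    row = conn.execute(
        "SELECT score FROM pair_score "
        "WHERE category = ? AND word_a = ? AND word_b = ?",
        (category, a, b),
    ).fetchone()
    return row[0] if row else 0


bump_pair("address", "street", "london")
bump_pair("address", "london", "street")
print(get_pair("address", "street", "london"))
```

Only pairs that were actually seen together get a row, so the table stays sparse, but I don't know how large it becomes on a real corpus.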

Is this what we call a neural network? What is the best way to store it?

Do you have any tips to improve my detection algorithm?