Preprocessing Text Data for Machine Learning

Machine learning models cannot work directly with text data. You first need to encode your text in some numeric form.

https://pixabay.com/fr/users/gdj-1086657/

Any text document is essentially a sequence of words, so the first step is to tokenize it into individual words. Once the document is a sequence (or list) of words, you can represent each word in numeric form using some kind of numeric encoding.
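As a minimal sketch of those two steps, the snippet below tokenizes a sentence and then maps each word to an integer id. The helper names (`tokenize`, `build_vocabulary`, `encode`), the sample sentence, and the regular expression are assumptions made for illustration. Integer ids are only one of several possible numeric encodings.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (illustrative rule)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocabulary(tokens):
    """Assign each distinct word an integer id, in order of first appearance."""
    vocab = {}
    for token in tokens:
        if token not in vocab:
            vocab[token] = len(vocab)
    return vocab


def encode(tokens, vocab):
    """Replace every token with its integer id from the vocabulary."""
    return [vocab[token] for token in tokens]


# Hypothetical sample document, not taken from the article
document = "The cat sat on the mat."
tokens = tokenize(document)
vocab = build_vocabulary(tokens)
encoded = encode(tokens, vocab)

print(tokens)   # ['the', 'cat', 'sat', 'on', 'the', 'mat']
print(encoded)  # [0, 1, 2, 3, 0, 4]
```

Both occurrences of "the" receive the same id, so the encoded list keeps the word order of the original sentence while being purely numeric.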


Data Engineer and machine learning enthusiast with a great interest in cloud technologies

AnisKHELOUFI
