5 tips for pandas users
A popular Python library used by those working with data is pandas, an easy and flexible data…
words
Double words
Double words such as “the the” are a common source of writing errors. On the other hand, some doubled words…
MDS codes
A maximum distance separable code, or MDS code, is a way of encoding data so that the distance between code…
Quick Introduction to Bag-of-Words (BoW) and TF-IDF for Creating Features from Text
The Challenge of Making Machines Understand Text: Language is a wonderful medium of communication. You and I would have understood…
A gentle introduction to Hamming codes
The previous post looked at how to choose five or six letters so that their Morse code representations are as…
Queueing theory and regular expressions
To find out, I did an experiment like the classic examples from Kernighan and Pike demonstrating regular expressions by searching…
Lemma, Lemma, Red Pyjama: Or, doing words with AI
It depends. Capitalization by itself doesn’t usually change the meaning of a word, so a computer can usually be told…
Models for Thinking: An Example of Why Data Sciences Increasingly Need the Humanities
The answer, often, is to configure them into abstract but coherent topics. As a consequence, the software-based output of your…
Estimating vocabulary size with Heaps’ law
Heaps’ law says that the number of unique words in a text of n words is approximated by V(n) = K n^β where…
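As a rough illustration of the formula in this excerpt, the sketch below evaluates V(n) = K n^β in Python. The function name and the values of K and β are illustrative assumptions, not figures taken from the post; in practice both constants are fitted to a particular corpus.

```python
def heaps_vocabulary(n, K, beta):
    """Approximate the number of unique words in a text of n words.

    Implements Heaps' law, V(n) = K * n**beta, as stated in the excerpt.
    """
    return K * n ** beta


# Hypothetical constants chosen only to make the arithmetic easy to follow.
K = 10.0
beta = 0.5

# Estimated vocabulary size for a text of one million words.
estimate = heaps_vocabulary(1_000_000, K, beta)
print(estimate)
```

Because β is less than 1 here, the estimate grows more slowly than the text itself: doubling n does not double the estimated vocabulary.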
NLP Tutorial: MultiLabel Classification Problem using Linear Models
Georgios Drakos, Jun 16. This article presents in detail how to predict tags for posts from…
Using unsupervised machine learning to uncover hidden scientific knowledge
We can get some clues by looking at the context words of the predicted materials, and see which of these…
How to use NLP to Analyze WhatsApp Messages
Maarten Grootendorst, Jun 27. On 17 August 2018, I married the woman of my dreams and…
Quantifying Chatroom Toxicity
Using Machine Learning to Identify Hate in Online Chatrooms. Jeremy Chow, Jun 25. (Image courtesy of Caspar Camille Rubin on Unsplash.) Note: Vulgar language examples…
Optimizing Source-Based-Language-Learning using Genetic Algorithm
Most would choose option #2 — because a sentence is closer to what I call a ‘cashable result’. Let’s look at some…
Processing Text data in Natural Language Processing
` * } @ : ; ^ |= &= += -= = /= *= Morphological Normalization: This type of normalization is needed when there…
A Game of Words: Vectorization, Tagging, and Sentiment Analysis
With Bag of Words, you can perform a logistic regression or other classification algorithm to show what documents (rows) within…
Different techniques to represent words as vectors (Word Embeddings)
From Count Vectorizer to Word2Vec. Karan Bhanot, Jun 7. (Photo by Romain Vignes on Unsplash.) Currently, I’m working…
Action Movies vs Dramas: How do Their Scripts Differ?
An analysis of the differences between action movies and dramas using Python and…
My First Usage of Natural Language Processing (NLP) in Industry
The answer was using Natural Language Processing (NLP). (Photo by Markus Spiske on Unsplash.) How NLP increased the data available: The business problem…
Representing music with Word2vec?
While training the network, we use sampled pairs, consisting of the input word with a random word from the context…
Process synchronization monitors in Go
Angad Sharma, May 2. Introduction: In recent times, programming has shifted into fifth gear by leveraging process synchronization…
The Mueller Report: An investigation in R
Aditya Mangal, May 5. With the recent release of the Mueller Report, I thought it would be…
A Game of Words
With all the hype surrounding the show at the moment I thought it the perfect time to investigate how Data…
Feature Extraction from Text (text data preprocessing)
” characters. Text Cleaning: We are gonna keep the words and spaces and remove everything else for further feature processing, but…
Build it Yourself: Chatbot API with Keras/TensorFlow Model
Step-by-step solution with source code to build a simple chatbot on top of Keras/TensorFlow…