Suppose you type a little text into a text file, say “123”. If you open this file in a hex…
character
What is Tokenization in NLP? Here’s All You Need To Know
Highlights: Tokenization is a key (and mandatory) aspect of working with text data. We’ll discuss the various nuances of tokenization,…
Working with strings in JavaScript
Nikhil Swain, May 15. HTML is text-based, so if you’re reading or writing data on a web…
Linguistics in Tech: Unicode
It’s the same governing body that standardized JS in the 1990s, making the official name ECMAScript. Enter Unicode: Unicode is an…
When to Use Python Object-Oriented Programming
There is a very important special method on the object class that we can take advantage of to represent our…
Some properties of ASCII characters
Anthony Abeo, Apr 20. When writing programs that deal with characters and strings, some of the methods programmers…
Using Regular Expression in Genetics with Python
finds the preceding character or character group zero or one times. If it is a requirement to be specific or…
RegExes in Ruby — A Brief Summary
Similar to a hash, we can assign names to the things we’ve matched. For this, we use the syntax /(?<key>regex)/…
4 Simple steps in building OCR
Naga Kiran, Mar 14. Optical character recognition (OCR) is the process of classifying optical patterns contained in a digital…
Regex: The Good, the Bad and the Basics
However, this is only a very simple Regex which in pseudo-English states [start of Regex][one or more non-word character][end of…
Data Cleaning, Detection and Imputation of Missing Values in R Markdown — Part 2
Wendy Wong, Jan 11. Data cleaning and transforming variables in R using…
Understanding Swift’s CharacterSet
This is precisely the reason why NSCharacterSet.characterIsMember(UTF8 or UTF16 or UTF32) internally calls longCharacterIsMember(UTF32) which only accepts a UTF32 character.³ CharacterSet…