NLP Road Map


  • Because the roadmap is drawn as a semantic mind map, the relationships among keywords can be read in more than one way. Please focus on the KEYWORDS in square boxes and treat them as the essential topics to learn.
  • Fitting a plethora of keywords and knowledge into a single image was challenging, so please treat this roadmap as one suggestion among many.
  • You are free to use this material as you see fit, including for commercial purposes, but you are strongly encouraged to leave a reference (
Probability and Statistics

Rabin–Karp algorithm

I know it has been a long while since I last updated my website. I even missed the entire month of May…

Actually, I have been confused about my future path over these two months. I got a bunch of offers from different places, but I still don’t know where I should ultimately go.

So I have just been trying to learn as much new stuff as I can to kill the time…

Rabin–Karp is a string-searching algorithm created by Richard M. Karp and Michael O. Rabin. It uses a rolling hash to find exact matches of a pattern in a given text, and it can also be extended to search for multiple patterns at once.
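To make the rolling hash idea concrete, here is a minimal sketch that slides a window over a string and checks that the updated hash always equals the hash recomputed from scratch. The helper names `window_hash` and `roll`, the sample text, and the values `base = 256` and `mod = 101` are all assumptions chosen for illustration.

```python
def window_hash(s, base, mod):
    """Polynomial hash of s, computed from scratch."""
    value = 0
    for ch in s:
        value = (base * value + ord(ch)) % mod
    return value


def roll(prev, out_ch, in_ch, high, base, mod):
    """Slide the window one step: drop out_ch on the left, append in_ch on the right."""
    return (base * (prev - ord(out_ch) * high) + ord(in_ch)) % mod


# Sample values (assumed for illustration only)
text, m, base, mod = "ABCCDDAEFG", 3, 256, 101
high = pow(base, m - 1, mod)  # weight of the window's leading character

h = window_hash(text[:m], base, mod)
for i in range(1, len(text) - m + 1):
    h = roll(h, text[i - 1], text[i + m - 1], high, base, mod)
    # The rolled value must equal the hash recomputed from scratch
    assert h == window_hash(text[i:i + m], base, mod)
```

Each update costs constant time regardless of the window length, which is what lets the search avoid rehashing every window in full.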

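def search(pattern, text, mod):
    # Let d be the number of distinct characters in the text; it serves as the hash base
    d = len(set(text))
    # Length of pattern
    l_p = len(pattern)
    # Length of text
    l_t = len(text)
    # Nothing to find for an empty pattern or one longer than the text
    if l_p == 0 or l_p > l_t:
        return
    p = 0  # hash value of the pattern
    t = 0  # hash value of the current text window
    h = 1
    # Let us calculate the hash value of the pattern.
    # Illustration with d = 10, mod = 13 and character values C = 3, D = 4
    # (the code itself uses ord() values instead):
    # hash value for pattern(p) = Σ(v * d^(m-1)) mod 13
    #                           = ((3 * 10^2) + (4 * 10^1) + (4 * 10^0)) mod 13
    #                           = 344 mod 13
    #                           = 6
    # h = d^(l_p - 1) mod `mod`, the weight of a window's leading character
    for i in range(l_p - 1):
        h = (h * d) % mod

    # Calculate hash value for pattern and for the first window of text
    for i in range(l_p):
        p = (d * p + ord(pattern[i])) % mod
        t = (d * t + ord(text[i])) % mod

    # Find the match
    for i in range(l_t - l_p + 1):
        # Equal hashes may be a collision, so confirm by comparing the characters
        if p == t and text[i:i + l_p] == pattern:
            print("Pattern is found at position: " + str(i + 1))

        if i < l_t - l_p:
            # Roll the hash: drop text[i], append text[i + l_p].
            # Python's % never returns a negative value for a positive mod,
            # so no extra "t += mod" correction is needed here.
            t = (d * (t - ord(text[i]) * h) + ord(text[i + l_p])) % mod

# Sample text; the original definition is missing, so this value is an assumption
text = "ABCCDDAEFG"
pattern = "CDD"
search(pattern, text, 13)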
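The multi-pattern case mentioned above can be sketched as follows, under the simplifying assumption that all patterns share one length. The function name `search_many`, the default `base` and `mod` values, and the sample inputs are hypothetical choices for illustration, not part of the original post. It returns 0-based start indices instead of printing.

```python
from collections import defaultdict


def search_many(patterns, text, base=256, mod=1_000_003):
    """Return {pattern: [0-based start indices]} for patterns of one shared length."""
    found = {p: [] for p in patterns}
    if not patterns:
        return found
    m = len(patterns[0])
    if any(len(p) != m for p in patterns):
        raise ValueError("this sketch assumes all patterns share one length")
    if m == 0 or m > len(text):
        return found

    def poly_hash(s):
        value = 0
        for ch in s:
            value = (base * value + ord(ch)) % mod
        return value

    # Group patterns by hash so one lookup per window covers all of them
    by_hash = defaultdict(set)
    for p in patterns:
        by_hash[poly_hash(p)].add(p)

    high = pow(base, m - 1, mod)  # weight of the window's leading character
    t = poly_hash(text[:m])
    for i in range(len(text) - m + 1):
        candidates = by_hash.get(t)
        if candidates:
            # Compare the actual characters to rule out hash collisions
            window = text[i:i + m]
            if window in candidates:
                found[window].append(i)
        if i < len(text) - m:
            # Roll the hash: drop text[i], append text[i + m]
            t = (base * (t - ord(text[i]) * high) + ord(text[i + m])) % mod
    return found


print(search_many(["CDD", "AEF", "XYZ"], "ABCCDDAEFG"))
# → {'CDD': [3], 'AEF': [6], 'XYZ': []}
```

Grouping the pattern hashes in a dictionary keeps the per-window cost at one lookup, no matter how many patterns are searched.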

Generate Parentheses

Given n pairs of parentheses, write a function to generate all combinations of well-formed parentheses.

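from typing import List


class Solution:
    def generateParenthesis(self, n: int) -> List[str]:
        ret = []

        # @functools.lru_cache(None)
        def dfs(curr, l, r):
            # A complete string uses all n opening and n closing parentheses
            if l == n and r == n:
                ret.append(curr)
                return
            # More closing than opening parentheses can never be well-formed
            if r > l: return 
            if l < n: dfs(curr + "(", l + 1, r)
            if r < n: dfs(curr + ")", l, r + 1)

        dfs('', 0, 0)

        return ret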
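For a quick check outside the class wrapper, the same depth-first idea can be sketched as a standalone function. The name `generate_parentheses` and the shared character buffer are choices of this sketch, not part of the solution above. Here a closing parenthesis is only added while it keeps the prefix balanced, so no invalid branch is ever entered.

```python
from typing import List


def generate_parentheses(n: int) -> List[str]:
    """Return all well-formed strings made of n parenthesis pairs."""
    result: List[str] = []
    buf: List[str] = []  # characters of the string being built

    def backtrack(opened: int, closed: int) -> None:
        if len(buf) == 2 * n:
            result.append("".join(buf))
            return
        if opened < n:
            buf.append("(")
            backtrack(opened + 1, closed)
            buf.pop()
        # Closing is allowed only while it keeps the prefix balanced
        if closed < opened:
            buf.append(")")
            backtrack(opened, closed + 1)
            buf.pop()

    backtrack(0, 0)
    return result


print(generate_parentheses(3))
# → ['((()))', '(()())', '(())()', '()(())', '()()()']
```

The number of results grows as the Catalan numbers (1, 2, 5, 14, ...), so the output list gets large quickly as n increases.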

Something interesting

No matter the field, you can do research while it is on the rise, move into industry once it levels off, and turn to education when it saturates. Clearly, Andrew Ng understands this.





To be honest, having a decent job, a house, and a car at the age of 25, we should not complain. However, cars, houses, and good jobs are all ordinary commodities; others may have ten or even a hundred times what I have. The ten years of youth, though, come only once for everyone.

So please remember this when you are 26 years old and deciding whether you want to be a 30-year-old PhD.

Machine learning, can?


Actually, we have long been aware of the trend toward massive parameter counts in machine learning. The number of CPUs and the access to data determine the final performance, not the person who researches machine learning algorithms.