You should probably copy Twitter's own twitter-text library. Their regex is a bit more complicated and written in Ruby, and I am not proficient enough in Python to push a real patch:
LATIN_ACCENTS = [(0xc0..0xd6).to_a, (0xd8..0xf6).to_a, (0xf8..0xff).to_a].flatten.pack('U*').freeze
HASHTAG_CHARACTERS = /[a-z0-9_#{LATIN_ACCENTS}]/io
However, HASHTAG_CHARACTERS are only allowed from the second position on; the tag must first contain at least one ASCII letter or underscore:
REGEXEN[:auto_link_hashtags] = /(^|[^0-9A-Z&\/]+)(#|＃)([0-9A-Z_]*[A-Z_]+#{HASHTAG_CHARACTERS}*)/io
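For whoever picks this up, here is a rough Python sketch of the same pattern. It is not a tested patch, only my reading of the Ruby above. The names AUTO_LINK_HASHTAGS and extract_hashtags are made up for illustration, and the second hash alternative is taken to be the full-width ＃ (U+FF03), as I believe twitter-text has it. Python's Unicode case folding is not identical to Ruby's /i, so edge cases may differ.

```python
import re

# Latin-1 accented letters, mirroring the Ruby LATIN_ACCENTS ranges
# (U+00C0-U+00D6, U+00D8-U+00F6, U+00F8-U+00FF).
LATIN_ACCENTS = "".join(
    chr(cp)
    for start, end in ((0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0xFF))
    for cp in range(start, end + 1)
)

# Characters allowed once the tag already has an ASCII letter or underscore.
HASHTAG_CHARACTERS = "[a-z0-9_" + LATIN_ACCENTS + "]"

# Group 1: preceding context, group 2: the hash sign, group 3: the tag text.
# re.MULTILINE is set because Ruby's ^ matches at the start of every line.
# Assumption: the second alternative is the full-width number sign U+FF03.
AUTO_LINK_HASHTAGS = re.compile(
    r"(^|[^0-9A-Z&/]+)(#|\uFF03)([0-9A-Z_]*[A-Z_]+" + HASHTAG_CHARACTERS + "*)",
    re.IGNORECASE | re.MULTILINE,
)


def extract_hashtags(text):
    """Return the tag text (without the hash sign) of each hashtag found."""
    return [match.group(3) for match in AUTO_LINK_HASHTAGS.finditer(text)]


if __name__ == "__main__":
    print(extract_hashtags("Loving #python and #caf\u00e9 today"))
```

A purely numeric tag such as #2010 is skipped because the pattern demands at least one ASCII letter or underscore, and a hash glued to a preceding letter, digit, & or / (as in an HTML entity like &#39;) is ignored too.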