
Source Code for Module pypln.workers.tokenizer

# coding: utf-8

from nltk import word_tokenize


# Pipeline metadata: this worker reads the 'text' property of a document
# and adds a 'tokens' property to it.
__meta__ = {'from': 'document',
            'requires': ['text'],
            'to': 'document',
            'provides': ['tokens']}


def main(document):
    # Tokenize the document's text with NLTK's default word tokenizer.
    text = document['text']
    result = word_tokenize(text)
    return {'tokens': result}
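
A minimal usage sketch, assuming the worker is called directly with a document dictionary containing a 'text' key (the PyPLN pipeline machinery that normally dispatches workers is omitted, and NLTK's 'punkt' tokenizer data must be available for word_tokenize):

    from pypln.workers.tokenizer import main

    document = {'text': 'PyPLN workers receive a document and return new properties.'}
    result = main(document)
    print(result['tokens'])
    # expected: ['PyPLN', 'workers', 'receive', 'a', 'document', 'and',
    #            'return', 'new', 'properties', '.']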