PyPLN - Distributed Natural Language Processing, with Python

What PyPLN is

PyPLN is a platform for processing and extracting useful information from text. It was conceived to run in the cloud, scale quickly and be easy to use. It integrates many text mining and natural language processing tools, which can be accessed through an easy-to-use Web interface where you can manage documents and corpora and interact with their analyses and visualizations.

Its main feature is the visualization of analyses such as part-of-speech tags, word frequency statistics and other useful information. It also offers full-text search over your corpora, so you can easily find a document and then visualize its analysis.
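To give a rough idea of what a word frequency statistic is, here is a minimal sketch in plain Python. It does not use PyPLN or its API: the function name `word_frequencies` and the simple regular-expression tokenizer are assumptions made only for this illustration, and a real pipeline would likely use a proper tokenizer.

```python
import re
from collections import Counter


def word_frequencies(text, top_n=2):
    """Return the top_n most common lowercase word tokens in text.

    Hypothetical helper for illustration only; not part of PyPLN.
    """
    # Naive tokenizer: runs of letters (Unicode-aware), lowercased.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(tokens).most_common(top_n)


sample = "The cat sat on the mat. The mat was flat."
print(word_frequencies(sample))  # [('the', 3), ('mat', 2)]
```

Counting lowercased tokens is the simplest possible choice; it ignores stop words, stemming and language-specific rules, which a full analysis would normally handle.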

PyPLN is developed by Núcleo de Análise e Modelagem de Dados (NAMD), a research group at the Applied Mathematics School of Fundação Getúlio Vargas in Rio de Janeiro, Brazil.

For more information, please check the project documentation.

Technology

Developers

Álvaro Justen

GitHub turicas
Twitter @turicas
Site turicas.info

Flávio Amieiro

Flávio Coelho

GitHub fccoelho
Twitter @fccoelho
Site fccoelho.github.com

Renato Souza

GitHub rsouza
Twitter @rrsouza
Site EMAp/Renato Souza

Try our Demo

Do you want to try PyPLN without installing it? Register a username and start using it for free (as in free beer) right now:

fgv.pypln.org

Note that this is only a demo installation, and we sometimes need to migrate or delete data to test new features (the project is in active development). Do not rely on this demo to store your documents or to build analyses on top of it; we advise you to install an instance on your own infrastructure.

Interact with the Community

One of the most valuable characteristics of free/libre software projects is the collaboration that comes from the community. Join our discussions, suggest new features and stay in touch through:

Get the Code

PyPLN is free (as in free speech) software. You can download the code, submit bug reports and contribute through GitHub. Feel free to fork our repositories and submit pull requests:

If you have skills in programming, linguistics or design and want to help this project, please see our contributing guidelines.

Sponsor

Our work is sponsored by Fundação Getúlio Vargas (a Brazilian think-tank university) and its Applied Mathematics School.

Team

We have a multidisciplinary team focused on creating the best free/libre platform for natural language processing: engineers, computer programmers, mathematicians and linguists working together on this project.

Are you using it?

If you are using PyPLN at your institution, please tell us about your experience! Join our mailing list, share your story and give us feedback so we can keep improving.

Distributed by definition

File-type and language detection

Embedded full-text search

Rich visualizations of text analysis