Adding py-langid, a standalone Language Identification (LangID) tool.

Thierry Thomas 2020-12-01 13:21:06 +00:00
parent c1a2fdd340
commit 181dac2f08
Notes: svn2git 2021-03-31 03:12:20 +00:00
svn path=/head/; revision=556731
5 changed files with 77 additions and 0 deletions

@@ -1297,6 +1297,7 @@
SUBDIR += py-jtextfsm
SUBDIR += py-junit-xml
SUBDIR += py-langdetect
SUBDIR += py-langid
SUBDIR += py-laserhammer
SUBDIR += py-libxml2
SUBDIR += py-license-expression

@@ -0,0 +1,33 @@
# Created by: Thierry Thomas <thierry@pompo.net>
# $FreeBSD$

PORTNAME=	langid
DISTVERSION=	1.1.6-20170715
CATEGORIES=	textproc devel python
PKGNAMEPREFIX=	${PYTHON_PKGNAMEPREFIX}

MAINTAINER=	thierry@FreeBSD.org
COMMENT=	Standalone Language Identification (LangID) tool

LICENSE=	BSD2CLAUSE
LICENSE_FILE=	${WRKSRC}/LICENSE

BUILD_DEPENDS=	${PYNUMPY}
RUN_DEPENDS=	${PYNUMPY}

USE_GITHUB=	yes
GH_ACCOUNT=	saffsd
GH_PROJECT=	${PORTNAME}.py
GH_TAGNAME=	4153583

USES=		python:3.6+ shebangfix
USE_PYTHON=	distutils
SHEBANG_GLOB=	*.py

NO_ARCH=	yes

post-extract:
	${MKDIR} ${WRKDIR}/unsupported-Python-2.7
	${MV} ${WRKSRC}/langid/train ${WRKDIR}/unsupported-Python-2.7

.include <bsd.port.mk>

@@ -0,0 +1,3 @@
TIMESTAMP = 1606746121
SHA256 (saffsd-langid.py-1.1.6-20170715-4153583_GH0.tar.gz) = 04b005bd607fcf54f9b06b41a20f968fcb5bdd7d96ec5471177c88ef858ffc9d
SIZE (saffsd-langid.py-1.1.6-20170715-4153583_GH0.tar.gz) = 1959856
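Note: the distinfo entries above record the size and SHA256 digest that the fetched distfile is expected to match. As a rough illustration of that check (not the ports framework's own implementation), the sketch below parses lines in this format and compares them against a downloaded file's bytes. The names `parse_distinfo` and `verify` are hypothetical helpers introduced only for this example.

```python
import hashlib
import re

# Matches lines such as "SHA256 (name.tar.gz) = <hex>" and "SIZE (name.tar.gz) = <n>".
DISTINFO_LINE = re.compile(r"^(SHA256|SIZE) \((?P<name>[^)]+)\) = (?P<value>\S+)$")


def parse_distinfo(text):
    """Map each distfile name to a dict with its "SHA256" and "SIZE" values."""
    entries = {}
    for line in text.splitlines():
        match = DISTINFO_LINE.match(line.strip())
        if not match:
            continue  # TIMESTAMP and blank lines are ignored
        key = match.group(1)
        value = match.group("value")
        entry = entries.setdefault(match.group("name"), {})
        entry[key] = int(value) if key == "SIZE" else value
    return entries


def verify(data, expected):
    """Return True when the bytes match both the recorded size and digest."""
    if len(data) != expected["SIZE"]:
        return False
    return hashlib.sha256(data).hexdigest() == expected["SHA256"]
```

In practice one would read the tarball from the distfiles directory in binary mode and pass its contents to `verify` together with the matching entry from `parse_distinfo`.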

@@ -0,0 +1,17 @@
langid.py is a standalone Language Identification (LangID) tool.

The design principles are as follows:

- Fast
- Pre-trained over a large number of languages (currently 97)
- Not sensitive to domain-specific features (e.g. HTML/XML markup)
- Single .py file with minimal dependencies
- Deployable as a web service

Remark: the main script langid/langid.py is cross-compatible with both Python2
and Python3, but the accompanying training tools are still Python2-only, hence
not installed by this port.

See also the port textproc/py-langdetect for a similar program.

WWW: https://github.com/saffsd/langid.py
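
Note: for readers who have not used the tool, here is a minimal sketch of calling the installed module from Python. `langid.classify` is the upstream entry point and returns a (language code, score) pair. The `identify` wrapper and its `classifier` parameter are hypothetical names introduced here so that the snippet still loads when the package is absent.

```python
# Assumes the port (or upstream langid.py) is installed so "import langid" works.
try:
    import langid
except ImportError:  # keep the sketch loadable without the package
    langid = None


def identify(text, classifier=None):
    """Return only the language code for the given text.

    classifier is a hypothetical hook for this sketch; when omitted, the
    upstream langid.classify() function is used, which returns a
    (language_code, score) tuple.
    """
    if classifier is None:
        if langid is None:
            raise RuntimeError("langid is not installed")
        classifier = langid.classify
    lang, _score = classifier(text)
    return lang


if __name__ == "__main__" and langid is not None:
    print(identify("This port installs a language identification tool."))
```

The score is discarded here for brevity; callers that need a confidence value can keep the full tuple returned by `langid.classify`.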

@@ -0,0 +1,23 @@
bin/langid
%%PYTHON_SITELIBDIR%%/langid-1.1.6-py%%PYTHON_VER%%.egg-info/PKG-INFO
%%PYTHON_SITELIBDIR%%/langid-1.1.6-py%%PYTHON_VER%%.egg-info/SOURCES.txt
%%PYTHON_SITELIBDIR%%/langid-1.1.6-py%%PYTHON_VER%%.egg-info/dependency_links.txt
%%PYTHON_SITELIBDIR%%/langid-1.1.6-py%%PYTHON_VER%%.egg-info/entry_points.txt
%%PYTHON_SITELIBDIR%%/langid-1.1.6-py%%PYTHON_VER%%.egg-info/not-zip-safe
%%PYTHON_SITELIBDIR%%/langid-1.1.6-py%%PYTHON_VER%%.egg-info/requires.txt
%%PYTHON_SITELIBDIR%%/langid-1.1.6-py%%PYTHON_VER%%.egg-info/top_level.txt
%%PYTHON_SITELIBDIR%%/langid/__init__.py
%%PYTHON_SITELIBDIR%%/langid/__pycache__/__init__.cpython-%%PYTHON_SUFFIX%%.opt-1.pyc
%%PYTHON_SITELIBDIR%%/langid/__pycache__/__init__.cpython-%%PYTHON_SUFFIX%%.pyc
%%PYTHON_SITELIBDIR%%/langid/__pycache__/langid.cpython-%%PYTHON_SUFFIX%%.opt-1.pyc
%%PYTHON_SITELIBDIR%%/langid/__pycache__/langid.cpython-%%PYTHON_SUFFIX%%.pyc
%%PYTHON_SITELIBDIR%%/langid/langid.py
%%PYTHON_SITELIBDIR%%/langid/tools/__init__.py
%%PYTHON_SITELIBDIR%%/langid/tools/__pycache__/__init__.cpython-%%PYTHON_SUFFIX%%.opt-1.pyc
%%PYTHON_SITELIBDIR%%/langid/tools/__pycache__/__init__.cpython-%%PYTHON_SUFFIX%%.pyc
%%PYTHON_SITELIBDIR%%/langid/tools/__pycache__/featWeights.cpython-%%PYTHON_SUFFIX%%.opt-1.pyc
%%PYTHON_SITELIBDIR%%/langid/tools/__pycache__/featWeights.cpython-%%PYTHON_SUFFIX%%.pyc
%%PYTHON_SITELIBDIR%%/langid/tools/__pycache__/printfeats.cpython-%%PYTHON_SUFFIX%%.opt-1.pyc
%%PYTHON_SITELIBDIR%%/langid/tools/__pycache__/printfeats.cpython-%%PYTHON_SUFFIX%%.pyc
%%PYTHON_SITELIBDIR%%/langid/tools/featWeights.py
%%PYTHON_SITELIBDIR%%/langid/tools/printfeats.py
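
Note: the `%%NAME%%` tokens in the packing list are placeholders that get replaced with values for the Python version in use when the package is built. The sketch below only illustrates that substitution and is not the framework's own code. The helper `expand_plist` is hypothetical, and the values in `EXAMPLE_SUBS` assume a Python 3.7 build purely for the sake of the example.

```python
import re

# Placeholder tokens look like %%PYTHON_VER%%.
PLACEHOLDER = re.compile(r"%%([A-Z0-9_]+)%%")

# Assumed example values for a Python 3.7 build; real values depend on
# the Python version selected when the port is built.
EXAMPLE_SUBS = {
    "PYTHON_SITELIBDIR": "lib/python3.7/site-packages",
    "PYTHON_VER": "3.7",
    "PYTHON_SUFFIX": "37",
}


def expand_plist(lines, subs):
    """Replace every %%NAME%% token using subs; unknown names raise KeyError."""
    return [PLACEHOLDER.sub(lambda m: subs[m.group(1)], line) for line in lines]


if __name__ == "__main__":
    sample = [
        "bin/langid",
        "%%PYTHON_SITELIBDIR%%/langid/__pycache__/langid.cpython-%%PYTHON_SUFFIX%%.pyc",
    ]
    for path in expand_plist(sample, EXAMPLE_SUBS):
        print(path)
```

The resulting paths are relative to the installation prefix, as in the list above.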