New port: textproc/py-docx2txt: Pure python-based utility to extract text and images from docx files

svn path=/head/; revision=472429
2018-06-15 07:09:59 +00:00 · 4587964b4c · 2021-03-31 03:12:20 +00:00
commit 4587964b4c
parent 9ea02e4dd8
4 changed files with 27 additions and 0 deletions
--- a/textproc/Makefile
+++ b/textproc/Makefile
@@ -1288,6 +1288,7 @@
    SUBDIR += py-dbfread
    SUBDIR += py-diff-match-patch
    SUBDIR += py-docutils
+    SUBDIR += py-docx2txt
    SUBDIR += py-dsv
    SUBDIR += py-duecredit
    SUBDIR += py-elasticsearch
--- a/textproc/py-docx2txt/Makefile
+++ b/textproc/py-docx2txt/Makefile
@@ -0,0 +1,19 @@
+# $FreeBSD$
+
+PORTNAME=	docx2txt
+DISTVERSION=	0.7
+CATEGORIES=	textproc python
+MASTER_SITES=	CHEESESHOP
+PKGNAMEPREFIX=	${PYTHON_PKGNAMEPREFIX}
+
+MAINTAINER=	yuri@FreeBSD.org
+COMMENT=	Pure python-based utility to extract text and images from docx files
+
+LICENSE=	MIT
+LICENSE_FILE=	${WRKSRC}/LICENSE.txt
+
+USES=		python
+USE_PYTHON=	distutils concurrent autoplist
+NO_ARCH=	yes
+
+.include <bsd.port.mk>
--- a/textproc/py-docx2txt/distinfo
+++ b/textproc/py-docx2txt/distinfo
@@ -0,0 +1,3 @@
+TIMESTAMP = 1529046338
+SHA256 (docx2txt-0.7.tar.gz) = 335363e5eb827dde2838fae69f5032b9a5c00f311def4022c196424bce697f0f
+SIZE (docx2txt-0.7.tar.gz) = 2781
--- a/textproc/py-docx2txt/pkg-descr
+++ b/textproc/py-docx2txt/pkg-descr
@@ -0,0 +1,4 @@
+The code is adapted from python-docx. It can however also extract text from
+header, footer and hyperlinks. It can now also extract images.
+
+WWW: https://github.com/ankushshah89/python-docx2txt
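
For readers unfamiliar with the format the pkg-descr refers to: a .docx file is a ZIP archive whose body text lives in word/document.xml and whose embedded images sit under word/media/. The sketch below illustrates that general idea using only the Python standard library. It is not the code shipped in docx2txt, and it skips the headers, footers and hyperlinks that the port's description mentions. The helper names (extract_text, list_images, make_sample_docx) are made up for this illustration, and the sample archive is a stripped-down stand-in rather than a complete Word document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def extract_text(docx_file):
    """Return the body text of a .docx, one line per paragraph."""
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph is split into runs; each run keeps its text in <w:t>.
        runs = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)


def list_images(docx_file):
    """Return the archive member names stored under word/media/."""
    with zipfile.ZipFile(docx_file) as archive:
        return [n for n in archive.namelist() if n.startswith("word/media/")]


def make_sample_docx():
    """Build a tiny in-memory archive (hypothetical, demo-only content)."""
    xml = (
        f'<w:document xmlns:w="{W_NS}"><w:body>'
        "<w:p><w:r><w:t>Hello</w:t></w:r><w:r><w:t> world</w:t></w:r></w:p>"
        "<w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        archive.writestr("word/document.xml", xml)
        archive.writestr("word/media/image1.png", b"placeholder bytes, not a real PNG")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    sample = make_sample_docx()
    print(extract_text(sample))
    print(list_images(sample))
```

Once the port is installed, the upstream project linked in the WWW line is the place to look for the module's actual interface; the sketch above only shows why a pure-Python, architecture-independent implementation (NO_ARCH=yes) is feasible.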