339 Open Source Parse Software Projects
Free and open source parse code projects including engines, APIs, generators, and tools.
Antlr4 8583 ⭐
ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
Remarkable 4891 ⭐
Markdown parser, done right. CommonMark support, extensions, syntax plugins, high speed - all in one. Gulp and metalsmith plugins available. Used by Facebook, Docusaurus and many others! Use https://github.com/breakdance/breakdance for HTML-to-markdown conversion. Use https://github.com/jonschlinkert/markdown-toc to generate a table of contents.
Swiftsoup 2433 ⭐
SwiftSoup: Pure Swift HTML Parser, with best of DOM, CSS, and jQuery (Supports Linux, iOS, Mac, tvOS, watchOS)
Lark Parser Lark 2052 ⭐
Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.
Node CsvtoJSON 1547 ⭐
Blazing fast and comprehensive CSV parser for Node.js / browser / command line.
Gray Matter 1056 ⭐
Smarter YAML front matter parser, used by metalsmith, Gatsby, Netlify, Assemble, mapbox-gl, phenomic, and many others. Simple to use, and battle tested. Parses YAML by default but can also parse JSON Front Matter, Coffee Front Matter, TOML Front Matter, and has support for custom parsers.
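As a rough illustration of what a front matter parser does before any YAML, JSON or TOML decoding happens, here is a minimal Python sketch that separates the delimited header from the body. It is not Gray Matter's API; the function name `split_front_matter` and the `---` default delimiter are assumptions made for this example.

```python
def split_front_matter(text, delimiter="---"):
    """Return (raw_front_matter, body) for a document.

    Hypothetical helper: it only splits the text, it does not decode
    the front matter as YAML/JSON/TOML.
    """
    lines = text.splitlines()
    # Front matter must start on the very first line.
    if not lines or lines[0].strip() != delimiter:
        return "", text
    # Look for the closing delimiter.
    for i in range(1, len(lines)):
        if lines[i].strip() == delimiter:
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    # No closing delimiter: treat the whole text as body.
    return "", text


doc = "---\ntitle: Hello\n---\nBody text"
front, body = split_front_matter(doc)
print(front)  # title: Hello
print(body)   # Body text
```

The raw header string would then be handed to whichever decoder matches its format.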
Tika Python 926 ⭐
Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
Chatistics 787 ⭐
💬 Python scripts to parse Messenger, Hangouts, WhatsApp and Telegram chat logs into DataFrames.
Parsepy 503 ⭐
A relatively up-to-date fork of ParsePy, the Python wrapper for the Parse.com API. Originally maintained by @dgrtwo
Probablepeople 409 ⭐
A Python library for parsing unstructured Western names into name components.
Breakdance 407 ⭐
It's time for your markup to get down! HTML to markdown converter. Breakdance is highly pluggable, flexible and easy to use.
Micromark 477 ⭐
The smallest CommonMark-compliant markdown parser that exists; new basis for @unifiedjs (hundreds of projects with billions of downloads for dealing with content)
Nlp Cube 331 ⭐
Natural Language Processing Pipeline - Sentence Splitting, Tokenization, Lemmatization, Part-of-speech Tagging and Dependency Parsing
Swiftpascalinterpreter 268 ⭐
Simple Swift interpreter for the Pascal language inspired by the Let’s Build A Simple Interpreter article series.
Pubmed_parser 231 ⭐
A Python parser for the PubMed Open-Access XML Subset and the MEDLINE XML Dataset
Termsql 222 ⭐
Convert text from a file or from stdin into an SQL table and query it instantly. Uses SQLite as the backend. The idea is to make SQL into a tool on the command line or in scripts.
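The text-to-table idea can be sketched with Python's standard `sqlite3` module. This is not Termsql's own code or command-line interface; the function name `text_to_table`, the table name `tbl` and the `COL0`, `COL1`, ... column names are assumptions chosen for the example.

```python
import sqlite3


def text_to_table(lines, table="tbl"):
    """Load whitespace-separated lines into an in-memory SQLite table.

    Hypothetical helper; columns are named COL0, COL1, ... by assumption.
    """
    rows = [line.split() for line in lines if line.strip()]
    if not rows:
        raise ValueError("no non-empty lines to load")
    width = max(len(row) for row in rows)
    columns = ", ".join(f"COL{i}" for i in range(width))
    placeholders = ", ".join("?" * width)

    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE {table} ({columns})")
    # Pad short rows with None so every row has the same number of columns.
    conn.executemany(
        f"INSERT INTO {table} VALUES ({placeholders})",
        [row + [None] * (width - len(row)) for row in rows],
    )
    return conn


conn = text_to_table(["alice 30", "bob 25", "carol 41"])
result = conn.execute(
    "SELECT COL0 FROM tbl WHERE CAST(COL1 AS INTEGER) > 28 ORDER BY COL0"
).fetchall()
print(result)  # [('alice',), ('carol',)]
```

Values are stored as text, which is why the query casts `COL1` before comparing it numerically.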
Sailormoon Flags 174 ⭐
⛳ Simple, extensible, header-only C++17 argument parser released into the public domain.
Snapdragon 174 ⭐
snapdragon is an extremely pluggable, powerful and easy-to-use parser-renderer factory.
Uap Ruby 169 ⭐
A simple, comprehensive Ruby gem for parsing user agent strings with the help of BrowserScope's UA database
Pegparser 145 ⭐
💡 Build your own programming language! A C++17 PEG parser generator supporting parser combination, memoization, left-recursion and context-dependent grammars.
Olefile 133 ⭐
olefile is a Python package to parse, read and write Microsoft OLE2 files (also called Structured Storage, Compound File Binary Format or Compound Document File Format), such as Microsoft Office 97-2003 documents, vbaProject.bin in MS Office 2007+ files, Image Composer and FlashPix files, Outlook messages, StickyNotes, several Microscopy file formats, McAfee antivirus quarantine files, etc.
Skrape.it 144 ⭐
A Kotlin-based testing/scraping/parsing library providing the ability to analyze and extract data from HTML (server & client-side rendered). It places particular emphasis on ease of use and a high level of readability by providing an intuitive DSL. It aims to be a testing lib, but can also be used to scrape websites in a convenient fashion.
Cppalliance JSon 153 ⭐
A C++11 library for parsing and serializing JSON to and from a DOM container in memory.
Python Benedict 138 ⭐
dict subclass with keylist/keypath support, I/O shortcuts (base64, csv, json, pickle, plist, query-string, toml, xml, yaml) and many utilities.
Logparser 100 ⭐
Easy parsing of Apache HTTPD and NGINX access logs with Java, Hadoop, Hive, Pig, Flink, Beam, Storm, Drill, ...
Dotenv Parse Variables 93 ⭐
Parse dotenv files for Boolean, Array, and Number variable types, built for Lad
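The kind of type coercion described here can be sketched in a few lines of Python. This is only an illustration of the technique, not the package's API (which is a Node.js module); the names `coerce` and `parse_env`, and the rule that a comma marks an array, are assumptions for this example.

```python
def coerce(value):
    """Convert a dotenv string into a bool, list, int, float or str.

    Hypothetical rules: "true"/"false" become booleans, comma-separated
    values become lists, numeric strings become numbers.
    """
    text = value.strip()
    if text.lower() in ("true", "false"):
        return text.lower() == "true"
    if "," in text:
        return [coerce(part) for part in text.split(",") if part.strip()]
    try:
        return int(text)
    except ValueError:
        pass
    try:
        return float(text)
    except ValueError:
        return text


def parse_env(lines):
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        env[key.strip()] = coerce(raw)
    return env


env = parse_env(["DEBUG=true", "PORT=8080", "HOSTS=a.example,b.example", "# comment"])
print(env)  # {'DEBUG': True, 'PORT': 8080, 'HOSTS': ['a.example', 'b.example']}
```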
Gonids 93 ⭐
gonids is a library to parse IDS rules, with a focus primarily on Suricata rule compatibility. There is a discussion forum available that you can join on Google Groups: https://groups.google.com/forum/#!topic/gonids/
Parse Github Url 88 ⭐
Parse a GitHub URL into an object. Supports a wide variety of GitHub URL formats.