Open source software is distributed under open source licenses. Many such licenses exist, and a single software package often contains files under several different licenses.
Atarashi provides different methods for scanning for license statements in open source software. Unlike existing rule-based approaches, such as the Nomos license scanner from the FOSSology project, Atarashi implements multiple text statistics and information retrieval algorithms.
The anticipated advantages are improved precision and a simple way to add new license texts or new license references.
Atarashi is designed to work stand-alone and with FOSSology. More info at http://fossology.github.io/atarashi
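To make the information retrieval idea concrete, here is a minimal sketch that scores a file's text against known license texts by cosine similarity of word counts. It is an illustration only and not Atarashi's implementation: the function names, the tokenizer, and the heavily shortened sample texts are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (Counter objects)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_licenses(file_text, license_texts):
    """Return (license name, score) pairs, best match first."""
    file_vec = Counter(tokenize(file_text))
    scores = [
        (name, cosine_similarity(file_vec, Counter(tokenize(text))))
        for name, text in license_texts.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical, heavily shortened license snippets for demonstration only.
KNOWN = {
    "GPL-2.0": "free software redistribute modify under the terms of "
               "the GNU General Public License version 2",
    "MIT": "permission is hereby granted free of charge to any person "
           "obtaining a copy of this software",
}

header = ("This program is free software; you can redistribute it under "
          "the terms of the GNU General Public License")

# Print every candidate with its score, best match first.
for name, score in rank_licenses(header, KNOWN):
    print(name, round(score, 3))
```

A real scanner would compare against full license texts and would normally apply a threshold before reporting a match; both are left out here to keep the sketch short.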
- Python >= v3.5
- pip >= 18.1
Install from PyPI:
pip install atarashi
Install from source (run in the project directory):
pip install .
- This downloads all required dependencies and triggers the build as well.
- The build generates 3 new files in your current directory:
  data/Ngram_keywords.json
  licenses/<SPDX-version>.csv
  licenses/processedList.csv
- The install script then moves these files to their appropriate locations.
To run the build step manually from a source checkout:
pip install -r requirements.txt
python3 setup.py build
Get help by running atarashi -h or atarashi --help
- Running DLD agent
  atarashi -a DLD /path/to/file.c
- Running wordFrequencySimilarity agent
  atarashi -a wordFrequencySimilarity /path/to/file.c
- Running tfidf agent
  - With Cosine similarity
    atarashi -a tfidf /path/to/file.c
    atarashi -a tfidf -s CosineSim /path/to/file.c
  - With Score similarity
    atarashi -a tfidf -s ScoreSim /path/to/file.c
- Running Ngram agent
  - With Cosine similarity
    atarashi -a Ngram /path/to/file.c
    atarashi -a Ngram -s CosineSim /path/to/file.c
  - With Dice similarity
    atarashi -a Ngram -s DiceSim /path/to/file.c
  - With Bigram Cosine similarity
    atarashi -a Ngram -s BigramCosineSim /path/to/file.c
- Running in verbose mode
  atarashi -a DLD -v /path/to/file.c
- Running with custom CSVs and JSONs
  - Please refer to the build instructions to get CSV and JSON files that atarashi understands.
    atarashi -a DLD -l /path/to/processedList.csv /path/to/file.c
    atarashi -a Ngram -l /path/to/processedList.csv -j /path/to/ngram.json /path/to/file.c
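When atarashi is driven from a script, the command lines above can be assembled programmatically. The following is a small sketch that uses only the flags shown in this README (-a, -s, -l, -j, -v); the helper name build_atarashi_command and its parameter names are hypothetical and not part of atarashi.

```python
import shlex


def build_atarashi_command(path, agent, similarity=None, processed_list=None,
                           ngram_json=None, verbose=False):
    """Assemble the argument list for one atarashi invocation."""
    cmd = ["atarashi", "-a", agent]
    if similarity is not None:
        cmd += ["-s", similarity]        # e.g. CosineSim, DiceSim
    if processed_list is not None:
        cmd += ["-l", processed_list]    # custom processedList.csv
    if ngram_json is not None:
        cmd += ["-j", ngram_json]        # custom Ngram keywords JSON
    if verbose:
        cmd.append("-v")
    cmd.append(path)
    return cmd


cmd = build_atarashi_command("/path/to/file.c", "Ngram", similarity="DiceSim")
# Show the command as it would be typed in a shell.
print(" ".join(shlex.quote(part) for part in cmd))
```

With atarashi installed, the resulting list can be passed to subprocess.run(cmd, check=True). Building a list rather than a single string avoids shell quoting problems for paths that contain spaces.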
- Pull Docker image
  docker pull fossology/atarashi:latest
- Run the image
  docker run --rm -v <path/to/scan>:/project fossology/atarashi:latest <options> /project/<path/to/file>
Since Docker cannot access the host file system directly, mount the directory containing the files to scan as a volume at /project in the container. Then pass the options and the path to the file relative to the mounted path.
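The path mapping between host and container can be sketched as follows. The helper splits a host file path into the directory to mount and the file name to pass under /project. The function name build_docker_scan_command and the example path /home/user/src/file.c are hypothetical; only the docker run form shown above is taken from this README.

```python
import os


def build_docker_scan_command(file_path, options=()):
    """Return a `docker run` argument list that scans file_path.

    The file's parent directory is mounted at /project, so the path
    given to atarashi inside the container is /project/<file name>.
    """
    abs_path = os.path.abspath(file_path)
    scan_dir, file_name = os.path.split(abs_path)
    return [
        "docker", "run", "--rm",
        "-v", "{}:/project".format(scan_dir),
        "fossology/atarashi:latest",
        *options,
        "/project/" + file_name,
    ]


docker_cmd = build_docker_scan_command("/home/user/src/file.c", ["-a", "DLD"])
print(docker_cmd)
```

To scan files in nested directories, mount a common parent directory instead and pass the path relative to it.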
- Run imtihaan (meaning "exam" in Hindi) with the name of the agent, e.g.
  python atarashi/imtihaan.py /path/to/processedList.csv <DLD|tfidf|Ngram> <testfile>
- See python atarashi/imtihaan.py --help for more options.
- Install dependencies
# apt-get install python3-setuptools python3-all debhelper
# pip install stdeb
- Create Debian packages
$ python3 setup.py --command-packages=stdeb.command bdist_deb
- Locate the files under deb_dist
SPDX-License-Identifier: GPL-2.0
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License version 2 as published by the Free Software Foundation.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
- Go to project directory 'atarashi'.
- Install Sphinx and m2r
  pip install sphinx m2r
  (Since this project is based on Python, pip is already installed.)
- Initialise the docs/ directory with sphinx-quickstart
  mkdir docs
  cd docs/
  sphinx-quickstart
  Answer the prompts as follows:
  Root path for the documentation [.]: .
  Separate source and build directories (y/n) [n]: n
  autodoc: automatically insert docstrings from modules (y/n) [n]: y
  intersphinx: link between Sphinx documentation of different projects (y/n) [n]: y
  - For all other prompts, use the default option.
- Set up conf.py and include README.md
  - Enable the following lines and change the insert path:
    import os
    import sys
    sys.path.insert(0, os.path.abspath('../'))
  - Enable m2r to insert .md files in Sphinx documentation:
    [...]
    extensions = [
        ...
        'm2r',
    ]
    [...]
    source_suffix = ['.rst', '.md']
  - Include README.md by editing index.rst
    .. toctree::
        [...]
        readme
    .. mdinclude:: ../README.md
- Auto-generate the .rst files in docs/source, which will be used to generate the documentation
  cd docs/
  sphinx-apidoc -o source/ ../atarashi
- cd docs
- make html
This generates the HTML files in docs/_build/html. Open index.html in that directory to view the documentation.
You can change the theme of the documentation by changing html_theme in the conf.py file in the docs/ folder.
You can choose from {'alabaster', 'classic', 'sphinxdoc', 'scrolls', 'agogo', 'traditional', 'nature', 'haiku', 'pyramid', 'bizstyle'}
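Put together, the conf.py settings described in these steps might look like the fragment below. This is a sketch and not the project's actual file: the two sphinx.ext entries are assumed to follow from answering "y" to the autodoc and intersphinx prompts, and 'nature' is an arbitrary pick from the theme list above.

```python
# docs/conf.py (fragment): a sketch, not the project's actual file
import os
import sys

# Make the atarashi package importable so autodoc can read its docstrings.
sys.path.insert(0, os.path.abspath('../'))

extensions = [
    'sphinx.ext.autodoc',      # assumed: enabled via the quickstart prompt
    'sphinx.ext.intersphinx',  # assumed: enabled via the quickstart prompt
    'm2r',                     # lets Sphinx include Markdown (.md) files
]

# Accept both reStructuredText and Markdown sources.
source_suffix = ['.rst', '.md']

# Any theme from the list above works; 'nature' is an arbitrary choice.
html_theme = 'nature'
```

After changing html_theme, run make html again in docs/ to rebuild the pages with the new theme.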
Reference