Development¶
Setup quickstart¶
Install required software: Docker, docker-compose (1.10+), make, and git.
Linux:
Use your package manager.
OSX:
Install Docker for Mac, which will install Docker and docker-compose.
Use homebrew to install make and git:
$ brew install make git
Other:
Install Docker.
Install docker-compose. You need something higher than 1.10, but less than 2.0.0.
Install make.
Install git.
Clone the repository so you have a copy on your host machine.
Instructions for cloning are on the Tecken page in GitHub.
(Optional for Linux users) Set UID and GID for Docker container user.
If you’re on Linux or you want to set the UID/GID of the app user that runs in the Docker containers, run:
$ make .env
Then edit the file and set the APP_UID and APP_GID variables. These get used when creating the app user in the base image. If you ever want different values, change them in .env and re-run make build.
Build Docker images for Tecken services.
From the root of this repository, run:
$ make build
That will build the app Docker image required for development.
Initialize Postgres and S3 (localstack).
Run:
$ make setup
This creates the Postgres database and sets up tables, integrity rules, and a bunch of other things.
For S3, this creates the required buckets.
Tecken consists of:
a Symbols Service webapp that covers uploading and downloading symbols
a Symbolication Service webapp (Eliot) that covers symbolication
To run these two services, do:
$ make run
The Symbols Service webapp is at: http://localhost:3000
The Symbolication Service webapp is at: http://localhost:8050
Bugs / Issues¶
All bugs are tracked in Bugzilla.
Write up a new bug:
https://bugzilla.mozilla.org/enter_bug.cgi?product=Tecken&component=General
If you want to do work for which there is no bug, it’s best to write up a bug first. Maybe the ensuing conversation can save you the time and trouble of making changes!
Code workflow¶
Bugs¶
Either write up a bug or find a bug to work on.
Assign the bug to yourself.
Work out any questions about the problem, the approach to fix it, and any additional details by posting comments in the bug.
Pull requests¶
Pull request summary should indicate the bug the pull request addresses. For example:
bug nnnnnnn: removed frob from tree class
Pull request descriptions should cover at least some of the following:
what is the issue the pull request is addressing?
why does this pull request fix the issue?
how should a reviewer review the pull request?
what did you do to test the changes?
any steps-to-reproduce for the reviewer to use to test the changes
After creating a pull request, attach the pull request to the relevant bugs.
We use the rob-bugson Firefox addon. If the pull request has “bug nnnnnnn: …” in the summary, then rob-bugson will see that and create an “Attach this PR to bug …” link.
Then ask someone to review the pull request. If you don’t know who to ask, look at other pull requests to see who’s currently reviewing things.
Code reviews¶
Pull requests should be reviewed before merging.
Style nits should be covered by linting as much as possible.
Code reviews should review the changes in the context of the rest of the system.
Landing code¶
Once the code has been reviewed and all tasks in CI pass, the pull request author should merge the code.
This makes it easier for the author to coordinate landing the changes with other things that need to happen like landing changes in another repository, data migrations, configuration changes, and so on.
We use “Rebase and merge” in GitHub.
Conventions¶
Python code conventions¶
All Python code files should have an MPL v2 header at the top:
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this
# file, You can obtain one at http://mozilla.org/MPL/2.0/.
We use black to reformat Python code and we use prettier to reformat JS code.
To lint all the code, do:
$ make lint
To reformat all the code, do:
$ make lintfix
HTML/CSS conventions¶
2-space indentation.
Javascript code conventions¶
2-space indentation.
All JavaScript code files should have an MPL v2 header at the top:
/*
* This Source Code Form is subject to the terms of the Mozilla Public
* License, v. 2.0. If a copy of the MPL was not distributed with this
* file, You can obtain one at http://mozilla.org/MPL/2.0/.
*/
Git conventions¶
First line is a summary of the commit. It should start with:
bug nnnnnnn: summary
After that, the commit should explain why the changes are being made and any notes that future readers should know for context or be aware of.
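For example, a commit message following this convention might look like this (the bug number and details here are made up):

```
bug 1234567: remove frob from tree class

The frob attribute has been unused since the tree class was
refactored. Removing it simplifies serialization. Note for future
readers: stored trees that still have a frob key are ignored on load.
```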
Managing dependencies¶
Python dependencies¶
Python dependencies are maintained in the requirements.in file and “compiled” with hashes and dependencies of dependencies in the requirements.txt file.
To add a new dependency, add it to requirements.in and then do:
$ make rebuildreqs
Then rebuild your docker environment:
$ make build
If there are problems, it’ll tell you.
In some cases, you might want to update the primary and all the secondary dependencies. To do this, run:
$ make updatereqs
JavaScript dependencies (Symbols Service)¶
Tecken uses yarn for JavaScript dependencies. Use the yarn installed in the Docker frontend container:
$ docker-compose run frontend bash
# display packages that can be upgraded
node@xxx:/app$ yarn outdated
# example of upgrading an existing package
node@xxx:/app$ yarn upgrade date-fns --latest
# example of adding a new package
node@xxx:/app$ yarn add some-new-package
When you’re done, you have to rebuild the frontend Docker container:
$ docker-compose build frontend
Your change should result in changes to frontend/package.json and frontend/yarn.lock, which both need to be checked in and committed.
Documentation¶
Documentation for Tecken is built with Sphinx and is available on ReadTheDocs.
To build the docs, do:
$ make docs
Then view docs/_build/html/index.html in your browser.
Testing¶
Unit tests¶
Tecken webapp and Eliot both have Python unit tests that use the pytest test framework.
To run all of the unit tests, do:
$ make test
See Python tests for Symbols Service webapp and Python tests for Symbolication Service for details.
System tests¶
System tests are located in the repository in systemtests/. See the README.rst there for usage.
System tests can be run against any running environment: local, stage, or prod.
Load tests¶
At various points, we’ve done some minor load testing of the system. The scripts are located in:
https://github.com/mozilla-services/tecken-loadtests/
They’re good for bootstrapping another load testing effort, but they’re not otherwise maintained.
Symbols Service webapp things¶
When running the Tecken webapp in the local dev environment, it’s at: http://localhost:3000
The code is in tecken/.
You can override Symbols Service webapp configuration in your .env file.
To log in, do this:
Click “Sign In” to start an OpenID Connect session on oidcprovider.
Click “Sign up” to create an oidcprovider account:
Username: a non-email username, like username
Email: your email address
Password: any password, like password
Click “Authorize” to authorize Tecken to use your oidcprovider account.
You are returned to http://localhost:3000. If needed, a parallel Tecken User is created, with default permissions and identified by email address.
You’ll remain logged in to oidcprovider, and the account will persist until the oidcprovider container is stopped.
You can visit http://oidc.127.0.0.1.nip.io:8081/account/logout to manually log out.
Python tests for Symbols Service webapp¶
To run the tests, do:
$ make test
Tests for the Symbols Service webapp go in tecken/tests/.
If you need to run specific tests or pass in different arguments, you can use the testshell:
$ make testshell
app@xxx:/app$ pytest
<pytest output>
app@xxx:/app$ cd tecken/
app@xxx:/app/tecken$ pytest tests/test_download.py
<pytest output>
JavaScript tests¶
The Tecken webapp is built using JavaScript and React. There are no tests for this code and it has to be tested manually. You can do something like this:
go to Tecken webapp website
wait for front page to load
click on “Home”
click on “Downloads missing”
click on “Symbolication”
click on “Help”
click on “Log in” and log in
click on “Home”
click on “Downloads missing”
click on “User management”
click on “API tokens”
click on “Uploads”
click on “Symbolication”
click on “Help”
click on “Sign out”
Database migrations¶
The Symbols Service webapp uses Django’s ORM and thus we do database migrations using Django’s migration system.
Do this:
$ make shell
app@xxx:/app$ ./manage.py makemigrations --name "BUGID_desc" APP
Accounts and first superuser¶
The Symbols Service webapp has an accounts system.
Users need to create their own API tokens, but before they can do that, they need to be granted the permission to do so.
The only person/people who can give other users permissions is the superuser. To bootstrap the user administration you need to create at least one superuser. That superuser can promote other users to superusers too.
This action does NOT require that the user has signed in at least once. If the user does not exist, it gets created.
The easiest way to create your first superuser is to use docker-compose:
$ docker-compose run --rm web bash python manage.py superuser yourname@example.com
Additionally, in a local development environment, you can create a corresponding user in the oidcprovider service like this:
$ docker-compose exec oidcprovider /code/manage.py createuser yourname yourpassword yourname@example.com
Giving users permission to upload symbols¶
The user should write up a bug. See Basics.
If the user is a Mozilla employee, needinfo the user’s manager and verify the user needs upload permission.
If the user is not a Mozilla employee, find someone to vouch for the user.
Once vouched:
Log in to https://symbols.mozilla.org/users
Use the search filter at the bottom of the page to find the user
Click to edit and give them the “Uploaders” group (only).
Respond and say that they now have permission and should be able to either upload via the web or create an API Token with the “Upload Symbol Files” permission.
Resolve the bug.
Viewing all metrics keys¶
In the Symbols Service webapp, to get insight into all metrics keys that are used, a special Markus backend called tecken.libmarkus.LogAllMetricsKeys is enabled. It’s enabled by default in local development. To inspect its content, you can either open all-metrics-keys.json directly (it’s git-ignored) or you can run:
$ make shell
app@xxx:/app$ ./bin/list-all-metrics-keys.py
Now you can see a list of all keys that are used. Take this and, for example, make sure you make a graph in Datadog of each and every one. If there’s a key in there that you know you don’t need or care about in Datadog, delete it from the code.
The file all-metrics-keys.json can be deleted at any time and will be recreated.
Localstack (S3 mock server)¶
When doing local development, we mock AWS S3 by default and use localstack, an S3 emulator, instead.
When started with Docker, it starts a web server on :4566 that you can use to browse uploaded files. Go to http://localhost:4566.
How to do local Upload by Download URL¶
If you’re doing local development and want to work on Symbol Upload by HTTP-posting a URL, you have a choice: either put the files somewhere on a public network, or serve them locally.
Before you start doing local Upload by Download URL, you need to make your instance less secure, since you’ll be using URLs like http://localhost:9090. Add DJANGO_ALLOW_UPLOAD_BY_ANY_DOMAIN=True to your .env file.
To serve them locally, first start the dev server (make run). Then start a bash shell in the currently running web container:
$ make shell
Now, you need some .zip files in the root of the project, since it’s mounted and can be seen by the containers. Once they’re there, start a simple Python server:
$ ls -lh *.zip
$ python -m http.server --bind 0.0.0.0 9090
Now, you can send these in with tecken-loadtests like this:
$ export AUTH_TOKEN=xxxxxxxxxxxxxxxxxxxxxxxxx
$ python upload-symbol-zips.py http://localhost:8000 -t 160 --download-url=http://localhost:9090/symbols.zip
This way you’ll have three terminals: two bash terminals inside the container and one outside in the tecken-loadtests directory on your host.
Debugging a “broken” Redis¶
By default, we have our Redis Cache configured to swallow all exceptions (…and just log them). This is useful because the Redis Cache is only supposed to make things faster. It shouldn’t block things from working even if that comes at a price of working slower.
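Conceptually, the swallow-and-log behavior works like the following sketch. This is a hypothetical wrapper for illustration only; the real configuration lives in the cache backend settings, but the effect is the same: a broken Redis degrades performance instead of breaking requests.

```python
import logging

logger = logging.getLogger("tecken.cache")


class SwallowingCache:
    """Wrap a cache backend so backend errors are logged, not raised.

    Hypothetical sketch, not Tecken's actual implementation.
    """

    def __init__(self, backend):
        self.backend = backend

    def get(self, key, default=None):
        try:
            return self.backend.get(key)
        except Exception:
            # Swallow the error; callers fall back to the slow path.
            logger.warning("cache get failed for %r", key, exc_info=True)
            return default

    def set(self, key, value):
        try:
            self.backend.set(key, value)
        except Exception:
            logger.warning("cache set failed for %r", key, exc_info=True)
```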
To simulate that Redis is “struggling” you can use the CLIENT PAUSE command. For example:
$ make redis-cache-cli
redis-cache:6379> client pause 30000
OK
Now, for 30 seconds (30,000 milliseconds), all attempts to talk to the Redis Cache are going to cause a redis.exceptions.TimeoutError: Timeout reading from socket exception, which gets swallowed and logged. But you should be able to use the service fully.
For example, all things related to authentication, such as your session cookie, should continue to work because we use the cached_db backend in settings.SESSION_ENGINE. It just means we have to rely on PostgreSQL to verify the session cookie value on each and every request.
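The relevant Django setting is a standard Django option; shown here as a minimal settings fragment:

```python
# settings.py fragment: with the cached_db backend, sessions are
# written to the database and cached; if the cache is unavailable,
# Django falls back to reading sessions from the database.
SESSION_ENGINE = "django.contrib.sessions.backends.cached_db"
```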
Auth debugging¶
Cache/cookies issues¶
Anyone can test caching and cookies by going to https://symbols.mozilla.org/__auth_debug__. That’s a good first debugging step for helping users figure out auth problems.
Auth0 issues¶
Symbols Service uses Mozilla SSO. Anyone can log in, but by default accounts don’t have special permissions to anything.
A potential pattern: a user logs in with their work email (e.g. example@mozilla.com), gets permission to create API tokens, then uses the API tokens in a script, and later leaves the company, so she can no longer sign in with that email address. If this happens, her API tokens should cease to work, because they were created based on the understanding that she was an employee with access to that email address.
This is why there’s a piece of middleware that periodically checks that users who once authenticated with Auth0 are still there and not blocked.
Being “blocked” in Auth0 is what happens, “internally”, if a user is removed from LDAP/Workday and Auth0 is informed. There could be other reasons why a user is blocked in Auth0. Whatever the reason, users who are blocked immediately become inactive and are logged out if they’re logged in.
If it was an error, the user can try to log in again and if that works, the user becomes active again.
This check is done (at the time of writing) at most every 24 hours. Meaning, if you managed to sign in or use an API token, you have 24 hours to use this cookie/API token until your user account is checked again in Auth0. To override this interval, change the environment variable DJANGO_NOT_BLOCKED_IN_AUTH0_INTERVAL_SECONDS.
Testing if a user is blocked¶
To check if a user is blocked, use the is-blocked-in-auth0 command, which is a development tool shortcut for what the middleware does:
$ docker-compose run web python manage.py is-blocked-in-auth0 me@example.com
Symbolication Service webapp things (Eliot)¶
How Symbolication Service works¶
When running Symbolication Service webapp in the local dev environment, it’s at: http://localhost:8050
The code is in eliot-service/.
Symbolication Service webapp logs its configuration at startup. You can override any of those configuration settings in your .env file.
Symbolication Service webapp runs in a Docker container and is composed of:
Honcho process which manages:
eliot_web: gunicorn which runs multiple worker webapp processes
eliot_disk_manager: a disk cache manager process
Symbolication Service webapp handles HTTP requests by pulling sym files from the URLs configured by ELIOT_SYMBOL_URLS. By default, that’s https://symbols.mozilla.org/try.
The Symbolication Service webapp downloads sym files, parses them into symcache files, and performs symbol lookups with the symcache files. Parsing sym files and generating symcache files takes a long time, so it stores the symcache files in a disk cache shared by all webapp processes running in that Docker container. The disk cache manager process deletes least recently used items from the disk cache to keep it under ELIOT_SYMBOLS_CACHE_MAX_SIZE bytes.
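The eviction idea can be sketched in a few lines. This is a hypothetical illustration of least-recently-used eviction by file access time, not Eliot’s actual disk manager code:

```python
from pathlib import Path


def evict_lru(cache_dir, max_size):
    """Delete least recently used files until total size <= max_size.

    Hypothetical sketch of the disk cache manager's job: rank files
    by access time and remove the oldest until the cache fits.
    """
    files = sorted(
        (p for p in Path(cache_dir).rglob("*") if p.is_file()),
        key=lambda p: p.stat().st_atime,  # oldest access time first
    )
    total = sum(p.stat().st_size for p in files)
    for path in files:
        if total <= max_size:
            break
        total -= path.stat().st_size
        path.unlink()
    return total
```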
Metrics¶
Table of metrics:
Key | Type
---|---
eliot.symbolicate.api | timing
eliot.symbolicate.proxied | incr
eliot.symbolicate.request_error | incr
eliot.downloader.download | histogram
eliot.symbolicate.parse_sym_file.error | incr
eliot.symbolicate.parse_sym_file.parse | timing
eliot.symbolicate.jobs_count | histogram
eliot.symbolicate.stacks_count | histogram
eliot.symbolicate.frames_count | histogram
eliot.diskcache.get | histogram
eliot.diskcache.set | histogram
eliot.diskcache.evict | incr
eliot.diskcache.usage | gauge
eliot.sentry_scrub_error | incr
Metrics details:
- eliot.symbolicate.api¶
Type: timing
Timer for how long a symbolication API request takes to handle.
Tags:
version: the symbolication API version
v4: the v4 API
v5: the v5 API
- eliot.symbolicate.proxied¶
Type: incr
Counter for symbolication requests tagged by whether they were proxied or not.
Tags:
proxied: “1” if proxied, “0” if not
- eliot.symbolicate.request_error¶
Type: incr
Counter for errors in incoming symbolication requests.
Tags:
reason: the error reason
bad_json: the payload is not valid JSON
invalid_modules: the payload has invalid modules
invalid_stacks: the payload has invalid stacks
too_many_jobs: (v5) the payload has too many jobs in it
- eliot.downloader.download¶
Type: histogram
Timer for how long it takes to download SYM files.
Tags:
response: the HTTP response we got back
success: HTTP 200
fail: HTTP 404, 500, etc.
- eliot.symbolicate.parse_sym_file.error¶
Type: incr
Counter for when a sym file fails to parse.
Tags:
reason: the reason it failed to parse
bad_debug_id: debug_id is not valid
sym_debug_id_lookup_error: the debug_id isn’t in the sym file
sym_tmp_file_error: error creating a tmp file to save the sym file to disk
- eliot.symbolicate.parse_sym_file.parse¶
Type: timing
Timer for how long it takes to parse sym files with Symbolic.
- eliot.symbolicate.jobs_count¶
Type: histogram
Histogram for how many jobs were in the symbolication request.
Tags:
version: the symbolication API version
v4: the v4 API
v5: the v5 API
- eliot.symbolicate.stacks_count¶
Type: histogram
Histogram for how many stacks per job were in the symbolication request.
Tags:
version: the symbolication API version
v4: the v4 API
v5: the v5 API
- eliot.symbolicate.frames_count¶
Type: histogram
Histogram for how many frames per stack were in the symbolication request.
- eliot.diskcache.get¶
Type: histogram
Timer for how long it takes to get symcache files from the disk cache.
Tags:
result: the cache result
hit: the file was in cache
error: the file was in cache, but there was an error reading it
miss: the file was not in cache
- eliot.diskcache.set¶
Type: histogram
Timer for how long it takes to save a symcache file to the disk cache.
Tags:
result: the cache result
success: the file was saved successfully
fail: the file was not saved successfully
- eliot.diskcache.evict¶
Type: incr
Counter for disk cache evictions.
- eliot.diskcache.usage¶
Type: gauge
Gauge for how much of the cache is in use.
- eliot.sentry_scrub_error¶
Type: incr
Emitted when there are errors scrubbing Sentry events. Monitor these because it means we’re missing Sentry event data.
Python tests for Symbolication Service¶
To run all the tests, do:
$ make test
Tests for the Symbolication Service webapp go in eliot-service/tests/.
If you need to run specific tests or pass in different arguments, you can use the testshell:
$ make testshell
app@xxx:/app$ cd eliot-service
app@xxx:/app/eliot-service$ pytest
<pytest output>
app@xxx:/app/eliot-service$ pytest tests/test_app.py
<pytest output>