Testing Angular 2 apps: Part 1 – Introduction

In this series, I’m going to show you some tips for writing unit tests in Angular 2 with the Jasmine framework. Right now, it’s hard to find up-to-date recipes for writing unit tests for Angular 2 (version 2.4, to be exact), but I hope this series will give you some material to get you going.

1. Angular testing module

Angular 2 requires you to import your testing utilities (such as TestBed, async and inject) from @angular/core/testing; Jasmine functions like describe, beforeEach and it are available globally.

import { TestBed, async, inject } from '@angular/core/testing';

But this is not something you have to memorize. In fact, if you use angular-cli to seed your project (which I recommend), you’ll always get the minimal unit-testing boilerplate for the unit under test (component, service, etc.). If you run the following command:

$ ng generate service user-profile

angular-cli will generate two files for you: user-profile.service.ts and user-profile.service.spec.ts. The .spec file will look like this:

import { TestBed, async, inject } from '@angular/core/testing';
import { UserProfileService } from './user-profile.service';

describe('UserProfileService', () => {
  beforeEach(() => {
    TestBed.configureTestingModule({
      providers: [UserProfileService]
    });
  });

  it('should ...', inject([UserProfileService], (service: UserProfileService) => {
    expect(service).toBeTruthy();
  }));
});

So what do we have here? We imported three testing utilities from Angular 2:

  1. TestBed: very similar to NgModule; it sets up the dependencies for the defined tests. These dependencies are passed to .configureTestingModule() and are used to resolve anything the unit under test needs.
  2. async: needed when dependencies involve asynchronous processing (e.g. XHR calls).
  3. inject: allows us to inject dependencies at the TestBed level.

2. Dependency Injection

As mentioned earlier, we use TestBed to declare dependencies for our test suites, and we use inject to retrieve a dependency at the TestBed level.

   it('should ...', inject([Service], (service: Service) => { ... }));

If you need to get the dependency at the component level, you need to use the component injector, which is a property of the fixture’s DebugElement.

describe('MyComponent', () => {
   beforeEach(() => {
       TestBed.configureTestingModule({
           declarations: [ MyComponent ],
           providers: [ MyService ]
       });
   });

   it('should ...', () => {
       let fixture = TestBed.createComponent(MyComponent);
       let service = fixture.debugElement.injector.get(MyService);
   });
});


WARNING: DON’T CREATE THE SERVICE USING THE CONSTRUCTOR. Don’t even think about it. This is neither safe nor maintainable.

Now, we can refactor the code using Jasmine’s beforeEach function so we don’t repeat this snippet for each spec.

describe('MyComponent', () => {

   let component: MyComponent,
       service: MyService,
       fixture: ComponentFixture<MyComponent>;

   beforeEach(() => {
       TestBed.configureTestingModule({
           declarations: [ MyComponent ],
           providers: [ MyService ]
       });
   });

   beforeEach(() => {
     fixture = TestBed.createComponent(MyComponent);
     component = fixture.componentInstance;
     service = fixture.debugElement.injector.get(MyService);
   });

   it('should ...', () => {
       // ...
   });
});

3. Testing coverage

Generally we need to test everything in our application including Pipes, Services, Components and custom classes. Our tests should cover logic written with TypeScript, generated DOM, emitted events, etc.

Fortunately, angular-cli provides a rich test command. It accepts a coverage flag (--code-coverage), which will generate a test coverage report for your project. Furthermore, if you use a CI server, you can display a nice report using the Cobertura plugin, for example.


Thanks for reading! Have any questions? Ping me at @benzid_wael. In the next posts, I’ll show you how to write tests for services, components, etc.



Personally, I think the most awesome Microsoft innovation on the web is the TypeScript language. TypeScript is an interesting evolution of JavaScript. Indeed, TypeScript is to JavaScript much like what C++ was to C back then: it defines a superset with powerful facilities for the JavaScript language, which makes developers’ lives easier.

Installing TypeScript on Linux

Nothing special here; TypeScript is delivered as a Node.js module:

sudo npm install -g typescript

Why TypeScript?

TypeScript is a strongly-typed superset of JavaScript. In other words, it improves the syntax of your JavaScript code. TypeScript also encourages the declarative style of programming that we miss in JavaScript, which we’ll discuss further in the rest of this tutorial.

On the other hand, Angular 2 was fully written in TypeScript, and the language itself is backed by Microsoft. Thus, it is becoming a core front-end language at major tech companies around the world, which makes me think it will probably be the standard for front-end development in the coming years.

1. Type annotation

TypeScript supports the basic types you would expect in any language: boolean, number, array, etc.

var isRunning: boolean = false;
var height: number = 6;
var status: string = "stopped";
var myList: number[] = [1, 2, 3];

A helpful addition to JavaScript’s standard set of data types is the enum type.

enum Color {White, Red, Green, Blue, Purple, Black};
var c: Color = Color.White;

TypeScript also adds a new type, any. As its name indicates, it describes a variable whose type you don’t know at the time of writing the code. No type-checking is done for this type.

var nonTyped: any = true;
nonTyped = "Maybe a string instead";

2. Interfaces

In the real world, we need to represent more complex types (aka records). For this purpose, TypeScript defines the interface keyword. Below is an example from the official intro tutorial (which you can find here):

interface Person {
    firstname: string;
    lastname: string;
}

function greeter(person: Person): string {
    return "Hello, " + person.firstname + " " + person.lastname;
}

let user = {firstname: "Jane", lastname: "User"};

document.body.innerHTML = greeter(user);

In the above code, we defined an interface Person and a function greeter, which accepts a person object that must implement the Person interface and returns a string. In other words, the person object should have firstname and lastname properties.

3. TypeScript and OOP

TypeScript comes with awesome batteries that make OOP easier for developers and manage browser compatibility for you. Below is a simple example:

class Person {
    name: string;

    constructor(name: string) {
        this.name = name;
    }

    whois() {
        return this.name;
    }
}

var person = new Person("John Doe");

As you see, the syntax is easy to understand; especially if you know Java or C#.

3.1. Inheritance

In addition to being able to define custom classes, you can also extend them using the extends reserved keyword. Below is an example:

class Person {
  firstName: string;
  lastName: string;

  fullName(): string {
     return this.firstName + ' ' + this.lastName;
  }
}

class Student extends Person {
  major: string;
}

class Teacher extends Person {
  salary: number;
}

In this example, the Student and Teacher classes inherit from the Person class. As a result, they have access to the firstName and lastName properties and to the fullName method.

3.2. super

super is a reserved TypeScript keyword that gives you the possibility to call a parent method from a child instance. Below is an example:

class Base {
    log() {
      console.log('hello world');
    }
}

class Child extends Base {
    log() {
        super.log();
    }
}

4. Typings

Generally, we use third-party libraries in our projects, like jQuery or angular.js, and TypeScript expects a declaration file — denoted by a .d.ts extension — where it can find the API definition of the library. At this stage you should know about the gigantic DefinitelyTyped repo, where you’ll find declarations for almost all JavaScript libraries. We won’t go into too much detail here, but if we wanted to use underscore’s types, for example, we could simply run typings install underscore --save and have them downloaded into a path defined in typings.json. After that, you can use underscore’s type definitions anywhere in your project simply by including this line:

/// <reference path="underscore/underscore.d.ts" />

5. Conclusion

TypeScript is an interesting push toward improving on JavaScript’s shortcomings by introducing a static typing system, complete with interfaces and type unions. This helps us write safer, more legible and declarative code.

Do you feel that you’ll use TypeScript in your next project? Do you think it’ll be the future of JavaScript? Or do you think it’ll fail? Let me know what you think below!

Firefox caching and jQuery plugins

If you’re using the jQuery dropdown or bootstrap_dropdowns_enhancement plugin, don’t forget to set autocomplete="off" on any checkbox or radio input inside the dropdown menu. Otherwise, you’ll find that the browser displays one value but sends another one after you submit your form.

What’s happening?

With a normal input, Firefox remembers the last value when you refresh the page (but not if you hit Ctrl + F5, which clears it). So your dropdown will display the default value, while Firefox restores the last checked radio/checkbox values.

Practical introduction to web mining: data wrangling

Most of the programming work in a data analysis project is spent in the data preparation stage, because the collected data is rarely in the structure required and expected by your data processing application. Fortunately, Twitter data is structured, so we won’t spend a lot of time in this stage.

The first thing we have to do is load the collected data. There’s nothing special here; we only need the json Python module. Below is the code:

import json

def load_tweets(path):
    tweets = []
    with open(path, 'r') as file_stream:
        for line in file_stream:
            tweet = json.loads(line)
            tweets.append(tweet)
    return tweets

tweets_list = load_tweets("PL_tweets.txt")

1. Pandas

Next we will create a pandas DataFrame. Pandas is an open-source Python library providing high-level data structures and tools for data analysis. Pandas has two main data structure types:

  • Series: a one-dimensional array containing an array of data and an associated array of labels, called the index.
  • DataFrame: a tabular data structure containing a collection of columns. A DataFrame has both a row and a column index; in other words, a DataFrame is a collection of Series (a short sketch follows this list).
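
To make these two structures concrete, here is a minimal sketch (my own illustration, not from the original post):

import pandas as pd

# A Series: one-dimensional data plus an index of labels
s = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
print(s['b'])          # -> 20

# A DataFrame: a collection of named columns (each column is a Series)
df = pd.DataFrame({'lang': ['python', 'ruby', 'php'],
                   'tweets': [3, 1, 2]})
print(df['lang'])      # the 'lang' column is itself a Series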

Let’s first explore the tweet structure. If you don’t have any idea about the Twitter API, it’s a good idea to look at the official documentation before continuing this tutorial. Personally, I think the key attributes of a tweet are:

  • id: the tweet identifier
  • text: the text of the tweet itself
  • lang: acronym for the language (e.g. “en” for English, “fr” for French)
  • created_at: the date of creation
  • favorite_count, retweet_count: the number of favorites and retweets
  • place, coordinates, geo: geo-location information if available
  • user: the author’s full profile
  • entities: list of entities like URLs, @-mentions, hashtags and symbols
  • in_reply_to_user_id: user identifier if the tweet is a reply to a specific user
  • in_reply_to_status_id: status identifier if the tweet is a reply to a specific status

The code below will create a Pandas DataFrame object containing the most useful tweet metadata, which we will use in the next post of this series:

import pandas as pd

# create Pandas DataFrame
tweets = pd.DataFrame()

# create some columns
tweets['tweetID'] = [ tweet['id'] for tweet in tweets_list ]
tweets['tweetText'] = [ tweet['text'] for tweet in tweets_list ]
tweets['tweetLang'] = [ tweet['lang'] for tweet in tweets_list ]
tweets['tweetCreatedAt'] = [ tweet['created_at'] for tweet in tweets_list ]
tweets['tweetRetweetCount'] = [ tweet['retweet_count'] for tweet in tweets_list ]
tweets['tweetFavoriteCount'] = [ tweet['favorite_count'] for tweet in tweets_list ]
tweets['tweetGeo'] = [ tweet['geo'] for tweet in tweets_list ]
tweets['tweetCoordinates'] = [ tweet['coordinates'] for tweet in tweets_list ]
tweets['tweetPlace'] = [ tweet['place'] for tweet in tweets_list ] 

# tweeple information 
tweets['userScreenName'] = [ tweet['user']['screen_name'] for tweet in tweets_list ]
tweets['userName'] = [ tweet['user']['name'] for tweet in tweets_list ]
tweets['userLocation'] = [ tweet['user']['location'] for tweet in tweets_list ]

# tweet interaction 
tweets['tweetIsReplyToUserId'] = [ tweet['in_reply_to_user_id'] for tweet in tweets_list ]
tweets['tweetIsReplyToStatusId'] = [ tweet['in_reply_to_status_id'] for tweet in tweets_list ]

Super! We created our first data frame. The Pandas DataFrame provides a beautiful and rich API for visualizing and interacting with the data:

  • head(N): returns the first N rows
  • tail(N): returns the last N rows
  • iteritems(): iterates over (column name, Series) pairs
  • etc.

The code below will display the first 5 rows in our data frame:

>>> tweets.head(5)

2. Cleaning Data

Unfortunately, acquired data is usually dirty and has a lot of inconsistencies: duplicated entries, bad values, non-normalized values, etc. So the cleanup process should mainly include:

  • removing duplicate entries
  • stripping whitespace (a one-line sketch follows below)
  • normalizing numbers, dates, etc.

The output of this process is a clean dataset: a dataset consisting only of valid and normalized values, which will ensure that our analysis code WILL NOT CRASH!
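
For instance, whitespace stripping is a one-liner with the Pandas .str accessor; here is a minimal sketch (my own illustration, assuming the column names defined above):

# strip leading/trailing whitespace from string columns
tweets['tweetText'] = tweets['tweetText'].str.strip()
tweets['userLocation'] = tweets['userLocation'].str.strip()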

2.1 Missing data

If you followed the previous steps in this tutorial, you probably noticed, as shown in the figure below, the NaN values in some columns. NaN is a special value denoting missing data.


fig 1. Missing data

Now we have to handle these missing values. We have mainly two options:

  • replacing all NaN values with None
  • treating each column separately: for example, replacing NaN with None for the tweetIsReplyToUserId and tweetIsReplyToStatusId columns, and replacing both None and NaN with “Unknown” for the userLocation column, etc.

Personally, I will opt for the second option, and I’ll use the fillna method, which fills NaN values with the given value:

# let's handle userLocation column
tweets.userLocation.fillna("Unknown", inplace=True)
# Now let's replace the remaining NaN values with None
# (fillna(None) is not supported, so we use where/notnull instead)
tweets = tweets.where(pd.notnull(tweets), None)

Note that I set the inplace argument of the fillna method to True explicitly; otherwise the userLocation series would not be modified.
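
If you prefer not to modify the Series in place, the equivalent assignment form looks like this (a small sketch of my own, not from the original post):

# fillna without inplace=True returns a new Series,
# so assign the result back to the column explicitly
tweets['userLocation'] = tweets['userLocation'].fillna("Unknown")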

2.2 Bad data

If you previously took a look at the Twitter documentation, you probably know that the values of the tweetCreatedAt column are string representations of a date and time. We have to convert these values to datetime objects.

You could use the strptime function from the datetime module, which parses a string representation of a date and/or time. But I prefer Pandas’ to_datetime method, which parses and converts the entire series:

tweets.tweetCreatedAt = pd.to_datetime(tweets.tweetCreatedAt)

2.3 Duplicated data

Honestly, I didn’t expect to have duplicated entries in my dataset, but since the collection script crashed several times, I wasn’t surprised to find some. Pandas provides several methods to deal with duplicated data. The duplicated method annotates each row with a boolean specifying whether that row is a duplicate. By default, row identity is determined by checking all columns, but you can restrict it to specific columns. For our example, we can use only the tweetID column, as it’s a unique ID for the tweet.

>>> tweets.duplicated(['tweetID'])
0        False
1        False
2        False
3        False
4        False
5        False
6         True
7        False
8        False
9        False
10       False
11       False

You can drop duplicated rows using the drop_duplicates method, as below:

>>> tweets.drop_duplicates(['tweetID'], inplace=True)


I think I covered the most important tips and steps of the data wrangling stage. Note, though, that Twitter data is structured and fairly clean, which is not the usual case. In fact, real-world data is dirty: you’ll have to do more work on it before you can use it.

Waiting for your comments and suggestions.

Practical introduction to web mining: collect data

Web mining is the application of natural language processing techniques to web content in order to retrieve relevant information. It has become more important these days due to the exponential increase in digital content, especially with the appearance of social media platforms, and in particular Twitter, which constitutes a rich and reliable information source.

In this series, I’ll explain how to collect Twitter data, manipulate it and extract knowledge from it. As I am a fan of Python, I’ll try to compare Python to other programming languages such as Java, Ruby and PHP, based on the information we will collect from Twitter.

In this tutorial, we will start by collecting data from Twitter, and introduce tweepy and the structure of Twitter data.

1. Create a Twitter application

First of all, you need some Twitter keys to be able to connect to the Twitter API and gather data from it. Specifically, we need the API key, API secret, access token and access token secret. To get this information, follow the steps below:

  1. Go to https://apps.twitter.com and log in with your Twitter account.
  2. Create a new Twitter application.
  3. On the next page, in the “API keys” tab, you can find both the API key and the API secret.
  4. Scroll down and generate your access token and token secret.


Once you’ve created a new Twitter app and generated your keys, you can move on to the next step and start collecting data.

 2. Getting Data From Twitter

We will use the Twitter Streaming API to collect tweets related to 4 keywords: python, java, php and ruby. Happily, the Streaming API gives us the possibility to filter tweets by keyword. The code below will fetch, in real time, tweets that contain one of the keywords mentioned earlier:

# -*- coding: utf-8 -*-

from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream

# User credentials for Twitter API 
access_token = "ENTER YOUR ACCESS TOKEN"
access_token_secret = "ENTER YOUR ACCESS TOKEN SECRET"
consumer_key = "ENTER YOUR API KEY"
consumer_secret = "ENTER YOUR API SECRET"

class StdoutListener(StreamListener):

    def on_data(self, data):
        print data
        return True

    def on_error(self, status):
        print status

if __name__ == '__main__':
    # Twitter authentication
    listener = StdoutListener()
    auth = OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    stream = Stream(auth, listener)

    # Filter Twitter Streams to capture data by the keywords
    stream.filter(track=['python', 'java', 'php', 'ruby'])

Now if you run this command:

python get_tweets.py >> PL_tweets.txt

you’ll have, in the specified text file, the tweets containing one of the keywords python, java, php or ruby.

3. Understand Twitter response

The data collected previously is in JSON format, so it’s easy to read and understand. But I’ll take the time here to highlight some useful information inside the Twitter response.


As you probably noticed, each tweet contains information about the tweeple (the author), the hashtags and URLs appearing in the tweet, the main text of the tweet, the retweet count, the favourite count, etc.
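
If you want to explore the structure yourself, here is a minimal sketch (my own illustration, reading the PL_tweets.txt file produced above) that prints a few of the key fields:

import json

with open("PL_tweets.txt") as f:
    first_tweet = json.loads(f.readline())

# a few of the fields discussed above
print(first_tweet['id'])
print(first_tweet['text'])
print(first_tweet['user']['screen_name'])
print(first_tweet['entities']['hashtags'])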

Awesome, now you can start collecting data. The next posts of this series will be hot and exciting, and you’ll need a lot of data for them: the more data, the better the experience.

Stay tuned ….

Update your git repositories at once

If you use multiple git repositories, here is a CLI tool that allows you to update all of them at once: gitup. Gitup is a cross-platform tool designed to update a large number of git repositories at once, and it can manage different remotes, branches, etc.

To install gitup:

git clone git://github.com/earwig/git-repo-updater.git
cd git-repo-updater
sudo python setup.py install

If all your repositories are in the same directory (here repos), just launch gitup like this:

gitup ~/repos

and you get the latest version of all of them at once. You can also bookmark your repositories with the --add parameter:

gitup --add ~/repos
gitup --add ~/repos2

and then do your git pull for all of them just by running gitup with no arguments:

gitup
Practical, isn’t it? For further details, see the GitHub repository.

Scheduled jobs with Celery, Django and Redis

Setting up a deferred task queue for your Django application can be a pain, and it shouldn’t be. Some people use cron, which is not only a bad solution but a disaster. Personally, I use Celery. In this post, I’ll show you how to set up a deferred task queue for your Django application using Celery.

What’s Celery?

Celery is an asynchronous task queue/job queue based on distributed message passing. It is focused on real-time operation, but supports scheduling as well.

The promise of Celery is to allow you to run code later, or regularly according to a schedule. Unfortunately, running deferred tasks through Celery is not trivial, but it’s useful and beneficial, as it has a distributed architecture that scales as you need. Any Celery installation is composed of three core components (a minimal client-side sketch follows the list):

  1. Celery client: used to issue background jobs.
  2. Celery workers: the processes responsible for running jobs. Workers can be local or remote, so you can start with a single worker on the same web application server and add workers later as your traffic and load grow.
  3. Message broker: the client communicates with the workers through a message queue, and Celery supports several ways to implement these queues. The most commonly used brokers are RabbitMQ and Redis.
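
To make the “run code later” promise concrete, here is a minimal sketch of a task and how a client queues it (the module name and Redis URL are illustrative assumptions, not part of the project built below):

# tasks_sketch.py -- illustrative only
from celery import Celery

app = Celery('demo', broker='redis://localhost:6379/0')

@app.task
def add(x, y):
    # executed later by a worker process, not by the caller
    return x + y

# client side: returns immediately, a worker runs add(2, 3) in the background
result = add.delay(2, 3)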

Installing requirements

First, let’s install Redis:

$ sudo apt-get install redis-server

Now, let’s install some python packages:

pip install celery
pip install django-celery

Configuring Django for Celery

Once the installation is complete, you’re ready to set up the scheduler. Let’s configure Celery in your settings.py:


BROKER_URL = 'redis://'
CELERYBEAT_SCHEDULER = 'djcelery.schedulers.DatabaseScheduler'

The above lines configure Celery: which broker to use, and which scheduler to use for beat (periodic) events.
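
The django-celery integration also expects djcelery to be listed as an installed app; a minimal sketch of that settings entry (your other apps will of course differ):

# settings.py
INSTALLED_APPS = (
    # ... your existing apps ...
    'djcelery',
)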

Since you added the djcelery package to your INSTALLED_APPS, you need to create the Celery database tables. The instructions for that differ depending on your environment. If you use South or Django migrations (Django >= 1.7) for schema migrations:

$ python manage.py migrate

otherwise:

$ python manage.py syncdb

Below is the celery.py file used to set up Celery for your Django project:

# celery.py file
from __future__ import absolute_import

import os
import django

from celery import Celery
from django.conf import settings

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'demo.settings')

app = Celery('Scheduler')

# read the Celery configuration (BROKER_URL, CELERYBEAT_SCHEDULER, ...) from the Django settings
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)

Write some tasks

Let’s assume that you have a task that should be executed periodically; a good example might be a Twitter bot or a scraper.

import tweepy

api = tweepy.API()

def get_recent_tweets(query):
    for tweet in tweepy.Cursor(api.search, q=query,
                               rpp=100, result_type="recent").items():
        print tweet.created_at, tweet.text
        # Save the tweet into the database here

Now, we need to create a Celery task for get_recent_tweets:

## /project_name/app_name/tasks.py

from celery.decorators import task

from utils import twitter

@task
def get_recent_tweets(*args):
    # Just an example: delegate to the helper defined above
    twitter.get_recent_tweets(*args)
N.B: Things can get a lot more complicated than this.

Scheduling it

Now we have to schedule our tasks. The get_bigdata_tweets task will run every hour (“bigdata” is a topic I want to follow). For this purpose, I’ll use the celery beat scheduler. In the settings.py file, add this code:

from celery.schedules import crontab

CELERYBEAT_SCHEDULER = "djcelery.schedulers.DatabaseScheduler"

CELERYBEAT_SCHEDULE = {
    "get_bigdata_tweets": {
        'task': "bots.twitter.tasks.get_recent_tweets",
        # every hour, on the hour
        'schedule': crontab(minute=0),
        'args': ("bigdata",),
    },
}

For further details about scheduler configuration, see the documentation.