Hackathon 0x1 – Pimp my Review or the Epic Birth of a Gerrit Plugin

This series of articles describes some of the best projects built by the Intersec R&D team during the 2-day Hackathon that took place on the 3rd and 4th of July.

The goal had been set a day or two prior to the beginning of the hackathon: we were hoping to make Gerrit better at recommending relevant reviewers for a given commit. To those who haven’t heard of it, Gerrit is a web-based code review system. It is a nifty Google-backed open-source project evolving amid an active community of users. We have been using this product here at Intersec since 2011 and some famous software projects also rely heavily on it for their development process.

Da Review Pimpers

This would be a good metaphor to illustrate our mindset at the beginning of the hackathon! (credits: Team Fortress 2)

Our team consisted of five people: Kamal, Romain, Thomas, Louis and Romain (myself).

Continue reading

Memory – Part 6: Optimizing the FIFO and Stack allocators

Introduction

The most used custom allocators at Intersec are the FIFO and the Stack allocators, detailed in a previous article. The stack allocator is extremely convenient, thanks to the t_scope macro, and the FIFO is well suited to some of our use cases, such as inter-process communication. It is thus important for these allocators to be optimized extensively.

We are two interns at Intersec, and our objective for this six-week internship was to optimize these allocators as far as possible. Optimizing an allocator can have several meanings: it can be in terms of memory overhead, resistance to contention, performance… As the FIFO allocator is designed to work in single-threaded environments, and the t_stack is thread-local, we will only cover performance and memory overhead.

Continue reading

Hackathon 0x1 – Interactive mode in Behave

This series of articles describes some of the best projects built by the Intersec R&D team during the 2-day Hackathon that took place on the 3rd and 4th of July.

Presentation of the Project

As testers, we spend a lot of time working on behave, our test automation framework[1]. Our test framework is a great tool, but starting and initializing the product, then running tests one by one, takes a lot of time.

We are not only test automation developers. We also need to explore the product under test by experimenting again and again based on the information we gather along the way. Manual testing is the basic approach for performing non-trivial experiments, but it can benefit from automated testing as it offers a quick and reliable way to set up a product in any given state.

The hackathon[2] was the perfect opportunity to buy ourselves a new exploratory tool to perform interactive automated testing. To do that, we needed to be able to:

  • Gain more control over the set up steps
  • Pause the product in order to manually test the state of the product
  • Select some more steps to run, check again, and so on
  • Explore with some automatic help

We designed an interactive mode to manage the run of a scenario. The goal was to be able to play each sentence on demand. This mode would help us in the future to reproduce issues and to set up the environment faster for testing purposes or even for customer demonstrations.

Continue reading

  1. behave is a clone of cucumber written in Python. It is based on the BDD (Behavior Driven Development) principles. Tests are described as a succession of English sentences (assumptions, then actions, then results) which are themselves mapped to the corresponding Python code.
  2. A hackathon is an event in which computer programmers and others involved in software development, including graphic designers, interface designers and project managers, collaborate intensively on software projects. Intersec promotes this event to focus and enhance project innovation.

DAZIO: Detecting Activity Zones based on Input/Output sms and calls activity for geomarketing and trade area analysis

Introduction

Telecom data is a rich source of information for many purposes, including urban planning (Toole et al., 2012), human mobility patterns (Ficek and Kencl, 2012; Gambs et al., 2011), points of interest detection (Vieira et al., 2010), epidemic spread modeling (Lima et al., 2013), community detection (Morales et al., 2013), disaster planning (Pulse, 2013) and social interactions (Eagle et al., 2013).

One common task for these applications is to identify dense areas where many users stay for a significant time (activity zones), the regions linking these activity zones (transit zones), as well as the interactions between the identified activity zones. Thus, in the present article we will identify activity and transit zones to monitor and predict the activity levels in the telecom operator's network, based on the SMS and call input/output activity levels issued from the Telecom Italia Big Data Challenge. The results of the present study could be directly applied to:

  • Location-Based Advertising
  • defining a suitable place to open a new store in a city
  • planning where to add cell towers to improve QoS

The contribution of this work is twofold: to present a model accounting for changes of activity levels (over time) and to predict those changes using Markov chains. We also propose a methodology to detect activity and transit zones.
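To make the Markov-chain idea concrete, here is a minimal sketch of a first-order chain over discretized activity levels. It is only an illustration under stated assumptions, not the model used in the study: the three-level discretization (NB_LEVELS) and the helper names estimate_transitions and predict_next are hypothetical.

```c
#include <stddef.h>

#define NB_LEVELS 3  /* hypothetical discretization: low, medium, high */

/* Estimate first-order transition probabilities from an observed sequence
 * of activity levels (each value must be in [0, NB_LEVELS)). */
static void estimate_transitions(const int *levels, size_t n,
                                 double probs[NB_LEVELS][NB_LEVELS])
{
    size_t counts[NB_LEVELS][NB_LEVELS] = { { 0 } };

    /* Count how often level `from` is directly followed by level `to`. */
    for (size_t i = 0; i + 1 < n; i++) {
        counts[levels[i]][levels[i + 1]]++;
    }
    for (int from = 0; from < NB_LEVELS; from++) {
        size_t total = 0;

        for (int to = 0; to < NB_LEVELS; to++) {
            total += counts[from][to];
        }
        for (int to = 0; to < NB_LEVELS; to++) {
            /* Levels never observed fall back to a uniform distribution. */
            probs[from][to] = total ? (double)counts[from][to] / total
                                    : 1.0 / NB_LEVELS;
        }
    }
}

/* Predict the most likely next level given the current one. */
static int predict_next(double probs[NB_LEVELS][NB_LEVELS], int current)
{
    int best = 0;

    for (int to = 1; to < NB_LEVELS; to++) {
        if (probs[current][to] > probs[current][best]) {
            best = to;
        }
    }
    return best;
}
```

Given the activity history of one zone, estimate_transitions fills the transition matrix once, and predict_next can then be called for each new observation. A real model would probably also need to account for the time of day, which this sketch ignores.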

Continue reading

More about locality

In the third post of the memory series we briefly explained locality and why it is an important principle to keep in mind while developing a memory-intensive program. This new post is going to be more concrete and explain what actually happens behind the scenes in a very simple example.

This post is a follow-up to a recent interview with a (brilliant) candidate[1]. As a subsidiary question, we presented him with the following two structure definitions:

struct foo_t {
    int len;
    char *data;
};

struct bar_t {
    int len;
    char data[];
};

The question was: what is the difference between these two structures, and what are the pros and cons of each? For the remainder of the article we will suppose we are working on an x86_64 architecture.
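One way to get a feel for the difference is to write a constructor for each structure. The sketch below is only an illustration: the helpers foo_new, foo_delete and bar_new are hypothetical names, and they assume len > 0. On x86_64, sizeof(struct foo_t) is 16 (4 bytes of length, 4 of padding, 8 of pointer) while sizeof(struct bar_t) is 4, since the flexible array member does not count.

```c
#include <stdlib.h>
#include <string.h>

struct foo_t {
    int len;
    char *data;
};

struct bar_t {
    int len;
    char data[];
};

/* foo_t: the header and the payload live in two separate allocations, so
 * reading the data means following a pointer to another memory location. */
static struct foo_t *foo_new(const char *src, int len)
{
    struct foo_t *foo = malloc(sizeof(*foo));

    if (!foo) {
        return NULL;
    }
    foo->data = malloc(len);
    if (!foo->data) {
        free(foo);
        return NULL;
    }
    memcpy(foo->data, src, len);
    foo->len = len;
    return foo;
}

/* foo_t needs two calls to free(), in the right order. */
static void foo_delete(struct foo_t *foo)
{
    if (foo) {
        free(foo->data);
        free(foo);
    }
}

/* bar_t: a single allocation, the payload directly follows the length.
 * A plain free() is enough to release it. */
static struct bar_t *bar_new(const char *src, int len)
{
    struct bar_t *bar = malloc(sizeof(*bar) + len);

    if (!bar) {
        return NULL;
    }
    memcpy(bar->data, src, len);
    bar->len = len;
    return bar;
}
```

The foo_t variant lets data point to memory owned by someone else or be reallocated independently, whereas the bar_t variant keeps the length and the bytes next to each other in memory.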

By coincidence, an intern asked more or less at the same time why we were using bar_t-like structures in our custom database engine.

Continue reading

  1. we’re still hiring

Memory – Part 5: Debugging Tools

Introduction

Here we are! We spent 4 articles explaining what memory is, how to deal with it and what kinds of problems you can expect from it. Even the best developers write bugs. A commonly accepted estimate seems to be around a few tens of bugs per thousand lines of code, which is definitely quite huge. As a consequence, even if you have mastered all the concepts covered by our articles, you’ll still probably have a few memory-related bugs.

Memory-related bugs may be particularly hard to spot and fix. Let’s take the following program as an example:

#include <stdio.h>

#define MAX_LINE_SIZE  32

static const char *build_message(const char *name)
{
    char message[MAX_LINE_SIZE];

    sprintf(message, "hello %s!\n", name);
    return message;
}

int main(int argc, char *argv[])
{
    fputs(build_message(argc > 1 ? argv[1] : "world"), stdout);
    return 0;
}

This program is supposed to take a message as argument and print “hello <message>!” (the default message being “world”).

The behavior of this program is undefined: it is buggy, yet it will probably not crash. The function build_message returns a pointer to memory allocated in its own stack frame. Because of how the stack works, that memory is very likely to be overwritten by a later function call, possibly by fputs. As a consequence, if fputs internally uses enough stack memory to overwrite the message, the output will be corrupted (and the program may even crash); otherwise the program will print the expected message. Moreover, the program may overflow its buffer because it uses the unsafe sprintf function, which places no limit on the number of bytes written.

So, the behavior of the program varies depending on the size of the message given on the command line, the value of MAX_LINE_SIZE and the implementation of fputs. What’s annoying with this kind of bug is that the result may not be obvious: the program “works” well enough with simple use cases and will only fail the day it receives a parameter with the right properties to exhibit the issue. That’s why it’s important that developers are at ease with tools that help them validate (or debug) memory management.
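For reference, one possible way to fix both issues is sketched below. This is only one option among several: it changes the signature of build_message so that the caller provides the buffer, and replaces sprintf with snprintf.

```c
#include <stdio.h>

#define MAX_LINE_SIZE  32

/* The caller owns the buffer, so the message outlives this function call.
 * snprintf never writes more than `size` bytes, terminator included, so a
 * long name is truncated instead of overflowing the buffer. */
static const char *build_message(char *buf, size_t size, const char *name)
{
    snprintf(buf, size, "hello %s!\n", name);
    return buf;
}
```

In main, the caller would declare char message[MAX_LINE_SIZE] and pass it along with sizeof(message). Note that truncation is silent here; a stricter version could check the return value of snprintf.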

In this last article, we will cover some free tools that we consider should be part of the minimal toolkit of a C (and C++) developer.

Continue reading

Memory – Part 4: Intersec’s custom allocators

malloc() is not the one-size-fits-all allocator

malloc() is extremely convenient because it is generic. It does not make any assumptions about the context of the allocation and the deallocation. The two may immediately follow each other, or be separated by a whole job execution. They may take place in the same thread, or not… Since it is generic, it cannot treat allocations differently according to their lifetime, meaning that long-term allocations share the same pool as short-term ones.

Consequently, the implementation of malloc() is complex. Since memory can be shared by several threads, the pool must be shared and locking is required. Since modern hardware has more and more physical threads, locking the pool at every single allocation would have a disastrous impact on performance. Therefore, modern malloc() implementations have thread-local caches and will lock the main pool only if the caches get too small or too large. A side effect is that some memory gets stuck in thread-local caches and is not easily accessible from other threads.

Since chunks of memory can get stuck at different locations (within thread-local caches, in the global pool, or just simply allocated by the process), the heap gets fragmented. It becomes hard to release unused memory to the kernel, and it becomes highly probable that two successive allocations will return chunks of memory that are far from each other, generating random accesses to the heap. As we have seen in the previous article, random access is far from being the optimal solution for accessing memory.

As a consequence, it is sometimes necessary to have specialized allocators with predictable behavior. At Intersec, we have several of them to use in various situations. In some specific use cases we increase performance by several orders of magnitude.
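To illustrate what "specialized with predictable behavior" can mean, here is a minimal sketch of a bump (arena) allocator. It is not one of Intersec's allocators: the names arena_t, arena_alloc, arena_mark and arena_rewind are hypothetical, the arena has a fixed size, and the backing buffer is assumed to be suitably aligned.

```c
#include <stddef.h>
#include <stdint.h>

#define ARENA_ALIGN 16  /* assumed alignment of every returned chunk */

/* Hypothetical fixed-size arena: allocation bumps an offset, and everything
 * allocated since a saved mark is released at once by rewinding to it. */
typedef struct arena_t {
    uint8_t *base;
    size_t   size;
    size_t   used;
} arena_t;

static void arena_init(arena_t *a, void *buf, size_t size)
{
    a->base = buf;
    a->size = size;
    a->used = 0;
}

static void *arena_alloc(arena_t *a, size_t len)
{
    size_t padded;
    void *res;

    if (len > SIZE_MAX - (ARENA_ALIGN - 1)) {
        return NULL;
    }
    /* Round the request up so the next chunk stays aligned. */
    padded = (len + ARENA_ALIGN - 1) & ~(size_t)(ARENA_ALIGN - 1);
    if (padded > a->size - a->used) {
        return NULL;  /* arena exhausted */
    }
    res = a->base + a->used;
    a->used += padded;
    return res;
}

/* Save the current position of the arena. */
static size_t arena_mark(const arena_t *a)
{
    return a->used;
}

/* Release everything allocated since `mark` in constant time. */
static void arena_rewind(arena_t *a, size_t mark)
{
    a->used = mark;
}
```

Successive allocations are contiguous and need no locking as long as the arena belongs to a single thread. The price is that chunks cannot be freed individually, which is why such an allocator only fits allocations with a well-defined, stack-like lifetime.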

Continue reading

Memory – Part 3: Managing memory

Developer point of view

In the previous articles we dealt with memory classification and analysis from an outer point of view. We saw that memory can be allocated in different ways with various properties. In the remaining articles of the series we will take a developer point of view.

At Intersec we write all of our software in C, which means that we are constantly dealing with memory management. We want our developers to have a solid knowledge of the various existing memory pools. In this article we will have an overview of the main sources of memory available to C programmers on Linux. We will also see some rules of memory management that will help you keep your program correct and efficient.

Continue reading

Memory – Part 2: Understanding Process memory

From Virtual to Physical

In the previous article, we introduced a way to classify the memory claimed by a process. We used four quadrants defined by two axes: private/shared and anonymous/file-backed. We also touched on the complexity of the sharing mechanism and the fact that all memory is ultimately requested from the kernel.

Everything we talked about was virtual. It was all about reservation of memory addresses, but a reserved address is not always immediately mapped to physical memory by the kernel. Most of the time, the kernel delays the actual allocation of physical memory until the time of the first access (or the time of the first write in some cases)… and even then, this is done with the granularity of a page (commonly 4KiB). Moreover, some pages may be swapped out after being allocated, which means they get written to disk in order to allow other pages to be put in RAM.

As a consequence, knowing the actual size of physical memory used by a process (known as the resident memory of the process) is really a hard game… and the sole component of the system that actually knows about it is the kernel (it’s even one of its jobs). Fortunately, the kernel exposes some interfaces that will let you retrieve some statistics about the system or a specific process. This article takes an in-depth look at the tools provided by the Linux ecosystem to analyze the memory pattern of processes.
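As a small illustration of such an interface, the sketch below reads the resident size of the current process from /proc/self/statm, a Linux-specific file documented in proc(5). The helper name resident_bytes is hypothetical, and this is only one of several ways to obtain the figure.

```c
#include <stdio.h>
#include <unistd.h>

/* Return the resident set size of the current process in bytes, or -1 on
 * error. /proc/self/statm reports sizes in pages: the first field is the
 * total virtual size, the second one the resident size. */
static long resident_bytes(void)
{
    FILE *f = fopen("/proc/self/statm", "r");
    long vsize_pages, rss_pages;
    int nb;

    if (!f) {
        return -1;
    }
    nb = fscanf(f, "%ld %ld", &vsize_pages, &rss_pages);
    fclose(f);
    if (nb != 2) {
        return -1;
    }
    /* Convert pages to bytes using the page size reported by the system. */
    return rss_pages * sysconf(_SC_PAGESIZE);
}
```

Calling resident_bytes before and after touching a freshly allocated buffer is a simple way to observe that physical memory is only committed on first access.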

Continue reading

Memory – Part 1: Memory Types

Introduction

At Intersec we chose the C programming language because it gives us full control over what we’re doing, and achieves a high level of performance. For many people, performance is just about using as few CPU instructions as possible. However, on modern hardware it’s much more complicated than just CPU. Algorithms have to deal with memory, CPU, disk and network I/Os… Each of them adds to the cost of the algorithm and each of them must be properly understood in order to guarantee both the performance and the reliability of the algorithm.

The impact of CPU (and as a consequence, the algorithmic complexity) on performance is well understood, as are disk and network latencies. However, memory seems much less understood. As our experience with our customers shows, even the output of widely used tools, such as top, is cryptic to most system administrators.

This post is the first in a series of five about memory. We will deal with topics such as the definition of memory, how it is managed, how to read the output of tools… This series will address subjects that will be of interest for both developers and system administrators. While most rules should apply to most modern operating systems, we’ll talk more specifically about Linux and the C programming language.

Continue reading