Factory Boy

Today I started using a new tool for a Django project I’m working on: Factory Boy.

The project description says it quite clearly: it is a tool to help create fixtures for tests, that is, sets of test data on which to run automated code tests.

In a real code project, we are likely to run into two things:

1. We will have models with lots of fields, many of them required.

2. The number of relationships between models can be high. For example, a book may have an FK to an author, and the author in turn an FK to their country of birth, and so on.

Where does this lead? To be able to test anything in our automated tests, we need a relatively large and heavy set of test objects (the fixture) on which to run most of the tests. To solve this problem, what I usually did was create one basic main fixture and then run all the tests against it.

This has the basic drawback that every test using that fixture has to recreate those objects each time (because Django performs a necessary flush of the database before each new test), which takes a long time.

One could think of setting up a hierarchy of fixtures so that each set of tests creates (more or less) only the objects it needs. In my opinion, this can be a nightmare to maintain and, therefore, one more reason to end up not writing automated tests, so I don’t recommend it.

This is where Factory Boy comes in, which in my view has basically two very important advantages:

1. It lets you set default values (even dynamic ones) for the model’s fields, so that when creating an object you only have to specify the values you don’t want to be the defaults.

This may seem trivial, but if you have many required fields it is a blessing. Especially when new fields are added to the models: instead of having to change it everywhere, you just change the factory definition and that’s it.

2. It lets you create objects in a chain. That is, if the book model has a required FK to an author, you can specify the desired attributes of the book and the author will be created automatically with the default values we have specified.

This also saves a lot of time, since we can easily create the objects each test needs at the start of the test function itself, instead of writing a setUp function shared by all the tests that creates objects not every test needs.
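The two advantages can be illustrated with a minimal sketch in plain Python, using only the standard library. This is not Factory Boy’s API: the Country, Author and Book classes are hypothetical stand-ins for Django models, and default_factory plays the role that Factory Boy’s own declarations (such as factory.Sequence and factory.SubFactory) play in a real factory.

```python
from dataclasses import dataclass, field
from itertools import count

# Counter used for a "dynamic" default: every new author gets a unique name.
_author_numbers = count(1)


@dataclass
class Country:
    name: str = "Spain"


@dataclass
class Author:
    # Dynamic default: computed at creation time, different for each object.
    name: str = field(default_factory=lambda: f"Author {next(_author_numbers)}")
    # Chained default: a Country is created automatically when none is given.
    country: Country = field(default_factory=Country)


@dataclass
class Book:
    title: str = "Untitled"
    pages: int = 100
    # Chained default: creating a Book also creates its Author (and Country).
    author: Author = field(default_factory=Author)


# Only the values that matter for this test are spelled out.
book = Book(title="Don Quixote")
print(book.title, "|", book.author.name, "|", book.author.country.name)
```

If a new required field is later added to Author, only its default needs to be declared in one place; every test that builds a Book keeps working unchanged.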

In short, a very useful tool that I recommend to anyone doing TDD with Python, and especially with Django.

SomSants, the neighbourhood’s social network

[Article published in La Burxa no. 169 (January 2013), written by Chris and me]

SomSants is a social network for the Sants neighbourhood. An open communication web tool, built with free software and managed close to home, by us, by you. It works much like Twitter: you can send short messages explaining what you are doing, reply to other messages, announce meetings and events, share files and links, etc.

SomSants was born from the observation that the mailing lists we normally use neither promote dissemination nor invite participation: what is developed in each group remains isolated from the work done in the others. A social network like SomSants is a natural complement to mailing lists, since it presents information in a format accessible to everyone, making messages reach further, enriching the associative, collaborative and social fabric of the neighbourhood and, on top of that, avoiding turning our data into merchandise in the hands of commercial networks such as Facebook, Twitter or Tuenti.

Come in now and take a look at what’s going on in the neighbourhood! You don’t need to register to read the information, but if you do you will be able to publish, subscribe to other people’s updates, work in groups, send private messages, etc. Join in!

Spam filtering – A Data Mining Project

My workmate Alessandro and I recently finished a project report on spam filtering using basic Machine Learning algorithms for the Data Mining course here at UPC. Here’s the abstract and complete report in PDF.

Spam is one of the major threats in the use of electronic mail nowadays. It consumes precious connection bandwidth, slows down mail servers and wastes people’s time. In this project we study the performance of a variety of supervised-learning models, including Logistic Regression, Naïve Bayes, Neural Networks, K-Nearest Neighbours, and Quadratic and Linear Discriminant Analysis, as well as their behaviour after applying a number of feature selection methods. Finally, we briefly compare our results with those of some previous studies that used the same database.

Spam filtering – PDF

The Humble Indie Bundle 2

A few days ago, thanks to Reddit, I came across this bundle of alternative games. It struck me as interesting the moment I saw it: five complete games for a pay-what-you-want price, which run on all platforms (Linux, Windows and Mac) and can be downloaded directly from the Internet. In addition, if you paid more than the average price (which was recalculated as payments came in), you also got access to the games from The Humble Indie Bundle 1, a first bundle with the same sales concept. And no DRM.

With the average at 7 dollars and change, I paid 10, but the contributions list shows donations of more than 6,000 dollars. I don’t know whether the business will turn out to be profitable for them, but as an experimental idea I think it’s great.

The games are varied and odd, as the word “indie” itself suggests. The best quick visual summary is this video, but here I’ll give my point of view. Without a doubt, the one I liked most of those I tried is Braid. It is a very well received 2D platform game in which you control a little guy in a suit named Tim in search of his beloved princess, collecting puzzle pieces scattered across the game’s levels. The novelty is that you can travel backwards in time. At first it seems this is just a way to come back when you get killed (because you bump into a walking head or a killer bunny), but you soon realise it is the core of the game, as it is absolutely vital for solving all the puzzles posed. There are six different worlds, and in each world time behaves in a different way, each cooler than the last. The graphics are pretty cool and so is the music. Also, the story about the princess is very curious.

Once you’ve finished it, which takes a day or two at most, there isn’t much more to it, but it’s well worth playing for the ingenuity of the puzzles. Another game that looked very cool but that I didn’t like much is Cortex Command. In it you control a series of robots that can buy weapons, and the physics of how things move is supposedly very well done, but I haven’t managed to get the hang of the controls and I find it very, very clunky.

Osmos is another really odd game. In it you control a circle-shaped organism that has to move around the screen eating organisms smaller than itself and avoiding larger ones and antimatter. To move you have to eject matter in one direction, which makes you smaller, so you have to economise and propel yourself as little as necessary. At first the other organisms are inert, but later other (artificially) intelligent organisms start to appear that flee from you or try to eat you (if they are bigger than you at that moment), making it truly difficult.

Revenge of the Titans is great fun. You take the role of the defence commander of cities facing the imminent onslaught of hordes of titans, which are nothing more than little figures that look like Pac-Man. To defend each city you have to build so-called “blasters”, a kind of automatic weapon that fires when a creature comes near. In each mission you can research new technologies, which let you buy better weapons and power-ups, and the incoming creatures get stronger and faster. The main drawback I see in the game is that it ends up becoming very, very hard, because the titans come from everywhere and destroy everything, you don’t have time to do everything you need to do, and you run out of money. The graphics suit the game very well (they’re cute), and are great fun.

As for the ones I’ve played from the first Bundle, one that stands out is Gish, where you control a ball of tar that has to make its way through various platform levels full of spikes and creatures that want to bite you. The single-player mode honestly gets tiring, because it’s always the same, but the shared-computer multiplayer mode is great for a few laughs with friends, in the fight or sumo modes, for two to four players. I have a blast with my brother. :)

World of Goo has pretty cool graphics and a curious game concept. It’s something like Lemmings, because you have to get some little balls of goo to their destination (a pipe), but the way to do it is to build a fixed structure out of the goo to reach the pipe. The only thing is that it feels designed more for killing time on an iPad (and the like) while riding the bus or waiting for a date with the princess than for playing at home for long stretches.

Lugaru HD is a rabbit fighting game, somewhat Matrix-style, that seems very cool at first, but I find the controls too clunky and the graphics rather bland. The funniest part: the sound the rabbits make when they talk and the volleys of blows they deal each other.

There are a few left that I’ve played little or not at all, but overall they are well worth the 10 dollars I paid. Congratulations on the initiative, and may the next Bundle come out soon!

Plone 3.3 Site Administration Review

Packt Plone books strike again! Written by the well-known Alex Clark and technically reviewed by the recurring Steve McMahon, Plone 3.3 Site Administration comes to my e-shelf. Alex being the most dedicated plone.org administrator, you can’t expect him to be wrong about how to manage a Plone site. :)

While the book’s target audience is claimed to be everyone interested in becoming more familiar with how to professionally manage Plone sites, I’ve found most of the book very, very basic. If you know how to use a terminal, a text editor and a browser, you’re unlikely to have many problems following the detailed tiny-step-by-tiny-step instructions provided in the book. However, the reader might sometimes feel like a script kiddie, executing commands and adding sections to his/her buildout without fully understanding what he/she’s doing (and why), and thus be unable to confidently change the configuration. This is especially true in the last chapters of the book.

The writing style is always casual and easy. Alex gets directly to the point without much blah-blah. The downside is that Alex sometimes uses concepts (like Five, FSDVs or CMF) without prior introduction or pointers to further documentation. But of course, you can always rely on Google. For some questions the reader might have, Alex has opted for a short-answer/medium-answer/long-answer scheme that, while the division is not always perfect, helps the reader decide how deep he/she wants to go.

The book is a gentle introduction to buildout and product installation (including basic theming) for absolute beginners, and that’s what the first half of the book is all about, but I had expected a longer treatment of load balancing schemes, cache proxies and settings for optimal performance, load testing, multimedia streaming, development-production products and buildout deployment, apache/nginx configuration for Plone, multiple ZODB mount-points and ZEO configuration, among others. These are the kind of things I would expect an advanced Plone site administrator to master, and what we need proper, comprehensive documentation for.

Summing up, if you fall inside the target audience outlined in the paragraph above, you’re going to like this book. If you’re looking for more hard-core site administration stuff, check out Planet Plone and other online docs.

Organizing a competition with Plone

The following story is an attempt to show an example of how one can work with Plone in a real-world project. It’s based on a real product whose development I contributed to: acentoweb.competition.

The requirements

The client wanted a system to manage and organize online photo and video competitions. The competitions’ announcements and rules would be published on the site. People who wanted to participate in a certain competition would register for it, entering their personal data, and then be able to submit photos and videos to be evaluated for the competition.

Participants should only be able to see their own submissions, and never those of the other participants, until the submission period is over. The judges of every competition, a group of designated people, should be able to see all the submitted items, but not their owners. The judges should also be able to rate or reject each submission.

The participants shouldn’t be able to edit or add elements in the rest of the site, only in the competitions they had signed up for, and the sign-up form should look as if they were signing up for a competition, not as if they were creating an account on a normal Plone site.

The proposed solution

This is how we decided to implement the product in Plone. Surely there might be smarter ways, so comments to improve the product are appreciated. :)

First, we decided to create a folderish Archetypes content-type to represent a Competition, with classic title and description, and rules. Folderish because it would hold photos and videos. The photos would be just a copy of the Image type, and the videos of the File type, perhaps including some integration with p4a. Having special Photo and Video content-types ensures we can assign a custom workflow to them, as we actually need.

To ensure that the participants can only see their own submissions during the competition submission period, we create a special workflow for them, competition_item_workflow, with three states:

  • Private: The participant is still preparing the item. Only the participant can see and modify it.
  • Pending: The item is waiting for the judge’s evaluation. The participant can’t modify the item anymore. The judge can see and evaluate it now.
  • Published: The competition has ended and the item is marked for public display.

Only the participant (the owner) can trigger the submission of an item, and the judge can publish it later, which is implemented using role guards in the respective transitions.
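As a rough illustration of the three states and the role guards, here is a small state machine in plain Python. It is only a sketch, not Plone’s workflow machinery: the transition names submit and publish and the CompetitionItem class are assumptions made for the example, while the Owner and Reviewer roles are the ones mentioned in this post.

```python
# Hypothetical sketch of competition_item_workflow; not Plone's workflow API.
# Each entry maps (current state, transition name) -> (guard role, new state).
TRANSITIONS = {
    ("private", "submit"): ("Owner", "pending"),
    ("pending", "publish"): ("Reviewer", "published"),
}


class CompetitionItem:
    def __init__(self):
        # Every item starts out private, visible only to its owner.
        self.state = "private"

    def do_transition(self, name, roles):
        """Apply a transition if it exists and the user holds the guard role."""
        key = (self.state, name)
        if key not in TRANSITIONS:
            raise ValueError(f"No transition {name!r} from state {self.state!r}")
        required_role, new_state = TRANSITIONS[key]
        if required_role not in roles:
            raise PermissionError(f"{name!r} requires the {required_role} role")
        self.state = new_state


item = CompetitionItem()
item.do_transition("submit", roles={"Owner"})      # the participant submits
item.do_transition("publish", roles={"Reviewer"})  # the judge publishes
print(item.state)
```

Keeping the guard next to each transition mirrors the idea that the permission check belongs to the transition itself rather than to the code that triggers it.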

Competitions themselves also have a dedicated workflow, competition_workflow, with states:

  • Private: In preparation. Can only be seen by the owner and the users he/she allows manually.
  • Open: The rules are published and participants can sign up and submit their works.
  • Closed: The competition doesn’t accept new work submissions and the results are published.

Since a single person can participate in more than one competition, we decided to make the participants create a user on the site and sign up later for each competition individually. To do so we:

  • Create a customized copy of the Plone join_form, with fields for location, phone number, and other personal data they need to enter when they sign up.
  • Customize the Competition view to include a “Sign-up for this competition!” button, which would grant the “Competitor” local role to the user, which in turn would grant him rights to create and submit items for the competition.

To hide the author info from the judge, we customize the plone.documentbyline viewlet to hide it for users without the Modify portal content permission over an object. It’s not the most optimal solution perhaps, but it just works for now.

The judge for each competition is assigned manually by the managers of the site, who grant the “Reviewer” local role to individual users via the Sharing tab.

The rating is implemented via plone.contentratings. We created a custom category with a custom rating manager, since the default one wasn’t working properly with the permission settings we set for rating and reading the ratings: competitors and the judge can’t see the ratings of submissions before the competition is closed, and the judge can only rate works while the competition is open.

On Deco and Tiles – the big picture

I’ve been working these months on a Google Summer of Code project entitled Core tiles development. One thing I wanted to do is write some documentation about how the whole Deco/Blocks/Tiles system works together. The reason is that there are a lot of packages and moving pieces involved, and it’s easy to get lost trying to understand what does what and in which order. I won’t try to explain in detail how each package does its work (read each package’s documentation if you’re interested) but rather introduce the different packages involved.

In short, Deco is a page composition system based on semantic HTML and a grid system. Instead of using custom XML namespaces and a templating language (like METAL), Deco uses plain (strategic) HTML.

To add this feature to a Dexterity content-type you just have to add the plone.app.layoutbehavior Dexterity behavior to it. This behavior adds two fields to the content-type: layout, to select the site layout you want to use, and content. This last field will contain all the tile-related HTML markup, and is populated by default with two field tiles: title and description. The Dexterity type with the cited behavior we’re currently using is named Page and lives in plone.app.page.

If the Deco UI package, plone.app.deco, is installed, it will detect the presence of the content field and activate. The Deco UI allows you to insert, drag-and-drop, edit and delete tiles inside the content field of a type.

To position the tiles on the screen, the Deco UI makes use of the so-called Deco Grid System, a bunch of carefully crafted CSS classes that, when applied to div elements, position them in the page with the appropriate dimensions.

Tiles are little more than browser views with associated configuration data, and their base classes live in plone.tiles. We have transient tiles, which store the configuration data in a querystring in the tile HTML, like:

http://host.org/@@plone.app.standardtiles.helloname/tile-1?name=Israel

and persistent tiles, for config data that cannot be encoded into a querystring (e.g. a large file), which store the data in the ZODB as annotations on the content object. Note the expected ‘@@’ for browser views in the URL above: when that URL is accessed, it returns an HTML page with a head and a body, like:

<html>
  <head>
    <link rel="stylesheet" type="text/css" href="names.css" />
  </head>
  <body>
    <p class="aName">Hello Israel!</p>
  </body>
</html>
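To make the transient-tile URL concrete, the following sketch splits it into its parts using only Python’s standard library. Reading the two path segments as “tile type” and “tile id” is an assumption drawn from the example URL, not code taken from plone.tiles.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://host.org/@@plone.app.standardtiles.helloname/tile-1?name=Israel"

parts = urlsplit(url)
# The path looks like "/@@<tile type>/<tile id>".
view_name, tile_id = parts.path.lstrip("/").split("/")
# Drop the "@@" prefix that marks a browser view.
tile_type = view_name[2:] if view_name.startswith("@@") else view_name
# parse_qs returns a list of values per key; keep the first value of each.
config = {key: values[0] for key, values in parse_qs(parts.query).items()}

print(tile_type, tile_id, config)
```

Everything a transient tile needs to render is carried by the URL itself, which is why nothing has to be stored in the ZODB for it.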

The plone.app.tiles (notice the “app” namespace) package registers the helper views @@add-tile, @@edit-tile and @@delete-tile to do exactly what their names say, the first two using a form generated from the tile data schema via plone.autoform.

So how are tiles actually rendered into a page? The answer resides in plone.app.blocks. This package is in charge of loading the page layout (remember the “layout” field added by plone.app.layoutbehavior?), merging in the contents of the page and “expanding” the tiles, merging the head of the tile into the head of the resulting page and putting its body where the placeholder for the tile was, as detailed in the documentation.
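The “expanding” step can be sketched with the standard library’s ElementTree. This is a simplified illustration under stated assumptions, not the plone.app.blocks implementation: the placeholder is assumed to be an element carrying a data-tile attribute, the tile output is passed in as a string instead of being fetched from its URL, and only one tile is handled.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: the data-tile attribute marks the tile placeholder.
PAGE = """<html>
  <head><title>My page</title></head>
  <body><div data-tile="./@@helloname/tile-1" /></body>
</html>"""

# What the tile view returns (same shape as the sample shown earlier).
TILE = """<html>
  <head><link rel="stylesheet" type="text/css" href="names.css" /></head>
  <body><p class="aName">Hello Israel!</p></body>
</html>"""


def expand_tile(page_xml, tile_xml):
    page = ET.fromstring(page_xml)
    tile = ET.fromstring(tile_xml)
    # 1. Merge the tile's head into the page's head.
    page.find("head").extend(list(tile.find("head")))
    # 2. Replace the placeholder with the contents of the tile's body.
    for parent in page.iter():
        for index, child in enumerate(list(parent)):
            if child.get("data-tile") is not None:
                parent.remove(child)
                for offset, node in enumerate(list(tile.find("body"))):
                    parent.insert(index + offset, node)
                return page  # stop right after modifying the tree
    return page


merged = expand_tile(PAGE, TILE)
print(ET.tostring(merged, encoding="unicode"))
```

The resulting page has the tile’s stylesheet link in its head and the tile’s paragraph where the placeholder used to be.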

Finally, the basic tiles to be inserted, including image, video, attachment, navigation tree and searchbox among others, live in plone.app.standardtiles.

Thank you very much for your replies.

In the end we are going to try to meet a few days before the rental month starts to sign the contract, pay the deposit in cash and receive the keys: all at once.

In addition, we are going to ask our (future) landlord to send us a copy of the contract text beforehand, so that we have time to discuss it should we disagree with anything.

On another note, while searching for information on the Net I found these two links, which are quite useful:

http://www.spaviv.es/informacion/normativa.php
http://www.upv.es/perfiles/estudiante/documentos/alojamientos_triptico.pdf

Plone 3 Multimedia review

Time for a new review of a Plone book! This time it’s Plone 3 Multimedia, by Tom Gross, and published by, guess who… Packt Publishing! One would say that Packt has a really good marketing team. :P

The mistake in the title strikes once again: most of the book, if not all of it, also applies to Plone 4, but Packt keeps following this policy.

The first thing I thought was… do we really need a whole book about multimedia in Plone? The answer is, well, there is enough material, enough multimedia-related products for Plone out there, to write a book about the topic if you want to.

One thing I don’t understand is what the target audience is supposed to be. The “Who this book is for” section claims that (please Packt don’t sue me for copyright-related issues ;):

This book is for Plone integrators who want to extend the core of Plone with multimedia features. It gives no introduction to Plone and readers should know how to set up a Plone site using a buildout. The book can be read and understood well even if the reader is not a Python developer, though some examples have Python code included.

The book starts giving definitions of what a CMS or what multimedia is and the different types of multimedia elements we can stumble upon, so you think it’s going to be soft, but it soon dives into using multimedia in Zope Page Templates and Python code, and later uses some more advanced concepts (e.g. automated testing, traversers, marking interfaces, zope events…) without (IMO) proper introduction.

It’s not that I can’t accept that the reader is required to have some prior Plone knowledge; what I don’t understand is the mixture of real-newbie material with more advanced coding material.

I would have appreciated a kind of requirements story to give more coherence to the content as a whole, something like what happens in the Plone 3 Products Development Cookbook or Professional Plone Development: a fictitious client that presents some requirements for a to-be-developed Plone site.

Plone 3 Multimedia doesn’t follow this pattern, and the result is a different structure, a reference presenting and briefly explaining different products to add multimedia features to your site, like the whole Plone 4 Artists (p4a) suite, plonetruegallery, Slideshowfolder, collective.flowplayer, Plumi, Vice, collective.uploadify or Red5, among others.

The last two chapters deal with what I think are vital topics when dealing with multimedia: storage and caching. In the storage one, I miss some more guidance about which storage system to choose in each situation and why, instead of just a list of different available products with storage-related features.

Finally, I don’t think the appendices, covering multimedia and syndication formats, licenses and links for getting more help, are worth it. We already have Wikipedia, Google searches and all, so if one wants to read about, say, Ogg Vorbis, one ends here or here, with a lot more info than what one can find in the corresponding appendix of the book. All these pages could have been better employed explaining the more advanced technical concepts in more depth, for example.

To sum up, I find this book good for “advanced” integrators or developers who are looking for an overview of the different available multimedia products for Plone. For the rest, I’ve no doubt you can learn something from it, but perhaps other books fit your profile better.

Free Culture X Conference notes

The Free Culture X Conference and Unconference took place on the 13th and 14th of February at George Washington University, in Washington, DC. I had the privilege to attend thanks to a travel grant made possible by the generosity of Google, Mozilla and Shareable.

As they define it, its vision is to bring together student activists and free culture luminaries to discuss free software and open standards, open access scholarship, open educational resources, network neutrality, and university patent policy, especially in the context of higher education.

Below is a summary of the notes I took during the conference. It doesn’t aim to be complete or precise, but I hope it gives an idea of what we discussed.

During the different keynotes and panels we used, apart from the classic raised hand, the backchan.nl tool for audience participation during conferences. Better than a massive Twitter, IMO.

After a short introduction, we started discussing the politics of open networks. It was pointed out that we need to come up with a clear definition of net neutrality and push the ISPs to implement the policies we want. Policy meetings about net neutrality often include a lot of industry representation, but seldom people from other affected sectors of the population, such as university campuses or consumers’ groups.

The three-strikes law to cut off Internet access can become extremely harmful in contexts where the connection is used as a platform for services like VoIP or TV: the affected user is disconnected completely, unable to make even emergency calls.

In general, the existence of a competitive market of ISPs, like the one in England, contributes to the natural enforcement of net neutrality.

Controversially, one of the panelists, Timothy B. Lee, presented his ideas about how to preserve net neutrality without legal regulation. You can read more about his ideas, and a quite long and in-depth paper, on his blog.

I’ve recently read bad news about net neutrality in Spain. Some of the major ISPs operating in the country, like Telefónica or Vodafone, are demanding monetary compensation from companies that use their infrastructure for their business, like search engines, mobile app distributors, or VoIP companies. I personally oppose the Internet becoming another TV.

Next, Pat Aufderheide gave a keynote about the concept of fair use to reuse and remix existing culture. Fair use is perfectly legal in the US and should be encouraged, even enlarging or modifying non-copying policies for homework in schools. There’s a lot of interesting material about fair use on the Center for Social Media website.

Moving on to the topic of Open Access and Access to Knowledge, it was pointed out that Open Access in public universities is low-hanging fruit and we should contact these universities to encourage them to adopt this model. The Open University Campaign, a Students For Free Culture project, contains valuable information about this. Also, the Right to Research Coalition is an excellent source of info about how to demand that the research work paid for with our taxes comes back to us without our having to pay additional unfair fees.

Unfortunately, some countries lack copyright exceptions for libraries and universities, and the changes in the law, usually promoted by the industry, always tend to make it more restrictive.

We need to take the discussion about Access to Knowledge from experts to the “family dinner”, explaining current common behaviour that is or would be illegal under the current laws, to make the public aware of how they are affected and get them to take part in the legal discussions about copyright laws.

Also, it was pointed out that personal meetings, faxes and phone calls are much more effective than emails or Facebook campaigns to make the politicians hear our opinion.

The next panel was about Open Educational Resources (OER). Eric Frank, from Flat World Knowledge (FWK), told us that textbooks are a major portion of the tuition costs in some countries. Flat World Knowledge provides a platform to create, publish and distribute quality, peer-reviewed, customizable and flexibly licensed books that are freely accessible online and affordable as printed copies. The people from FWK have observed that, even when there’s a free printable copy of the book available, some students prefer to buy the book, which makes this publishing model presumably sustainable.

The University of Michigan OER Team (Open.Michigan) is working on an impressive list of projects to enable groups and individuals to openly share their work. These are mostly collaboration tools, supporting the idea that knowledge is not just “transferred” from teachers to students, but something socially constructed. Some of the most interesting are dScribe, a framework to help faculty staff gather available educational material, clear possible copyright restrictions over it (so it can be published legally under certain circumstances) and reuse it to create and publish new OERs; and OERca, the free software platform that powers most of the dScribe framework.

Finally, Timothy Vollmer, from ccLearn, told us about how Creative Commons is helping the development of OERs, and about the tools available to adequately tag content for further indexing and discovery.