Saturday, December 3, 2016

Don't feed the zombies!

Sometimes, the best way to help the startups you care about is to let them die from starvation.
We all know entrepreneurs with terrible ideas, seeking validation, recognition and cash. I was (am?) one of them, and looking back at my startup journey I wish that I hadn't received the awards and investment that kept me working on the wrong idea for months (although it was a great learning experience).

If you really care about your startup friends, I encourage you to follow these recommendations:
  • Feedback / Surveys. If you are asked to provide feedback about an idea, don't be accommodating. Tell them what you really think, even (especially) if it is not nice. A good entrepreneur will be grateful to learn more about your real needs. Perhaps you are simply not part of their target market.
  • Install my app / Like my page. If you don't really like a project, why do you have to install their app, register as a user or like their Facebook page? This will inflate their traction metrics, giving the startup team false expectations.
  • Crowdfunding / FFF Investment. While you might think you are helping your friends by supporting their Kickstarter campaign or their FFF (Family-Friends-Fools) round, you are probably making them waste their time creating a product nobody wants, and losing your money at the same time.
  • Awards. Entrepreneurs need a boost to their ego from time to time. However, receiving an award for a project that is not good enough could be mistaken for real validation, encouraging founders to keep working in the same direction. Don't support nominations for projects you don't really like.

Sunday, October 23, 2016

10 things that I learned from organising a meetup


For the last 9 months I have co-organised and presented Machine Learning Dublin, a meetup group started by ADAPT Centre that rapidly grew from 0 to 1350 members. Now that I have stepped back, I want to share ten lessons that I learned through this wonderful experience.
  1. Form a great team. The effort to organise a meetup is largely underestimated: it can easily take you 30 hours per event. Make sure you have a good team of at least 4 people and assign responsibilities: finding speakers, sponsors and venues, presenting, managing website and social media, registration desk, etc.
  2. Secure great venues. The event space plays a fundamental role in the success or failure of a meetup. Secure nice-looking event spaces in the city centre that are easy to find by the attendees. Some topic-related companies, startup incubators and co-working spaces might offer their facilities for free.
  3. Be picky about speakers. Select 2 or 3 good speakers per event that engage with the audience and you will see the community grow. Make sure the topics are relevant and strictly forbid sales pitches. After a few full-house events you will have great speakers queuing to participate in future events.
  4. Don't accept any sponsor. If your first events are successful, sponsors will start queuing to host your meetup at their premises. Kindly decline offers from sponsors that don't share your values, that demand too much, that don't have suitable venues or that don't help you with the logistics.
  5. Manage attendance. This is probably the trickiest part. When you announce a new event people will sign up but many of them won't turn up. Some tips to minimise the impact:
    - Keep track of no-shows and ban them if they become recurrent no-goers.
    - Send reminders 48h before the event, so that participants can RSVP NO and allow people on the waitlist to attend.
    - Announce 20% more seats than the venue capacity in order to compensate for no-shows.
    - It is very common to meet attendees who don't care about the talks and just want to recruit, socialise or have some free pizza. Kindly remind them that the event is strictly for people who are interested in the topic.

  6. Take advantage of social media. Speakers, sponsors and organisers love to be mentioned in social media. Make sure you have a catchy hashtag that can be used during the event, and monitor reactions and feedback. Tweets from your attendees are your best PR.
  7. Engage with other meetup groups. Meetup groups on similar topics shouldn't compete but collaborate with each other. Share tips, speakers and sponsors. Promote each other and organise joint meetups if possible. It will benefit the entire community.
  8. Grow your network. As a meetup organiser you might become an influencer in the local community. Make the most out of it: engage with speakers, sponsors and attendees, and find ways to boost your professional career.
  9. Enjoy the talks. You will be pretty busy during the event. However, find time to enjoy the presentations and learn from the great speakers you brought.
  10. All the effort pays off. However painful organising a meetup can sometimes be, it always pays off: personal satisfaction, knowledge, connections, recognition and, sometimes, the most valuable and unexpected rewards.

Wednesday, September 28, 2016

Website Load Testing with Apache JMeter

I was exploring different load testing tools in order to make sure that the Ethics Canvas platform would withstand peaks of up to 100 users signing in every 10 seconds, when I came across Apache JMeter.

JMeter is a robust Java-based open-source tool that allows users to test the performance of web services, dynamic web languages, Java objects, databases and queries, FTP servers, etc.


Running JMeter (Linux):
  1. Download Apache JMeter
  2. Extract the content from the compressed file
  3. From the Terminal, navigate into the apache-jmeter/bin
  4. Type ./jmeter and it will open the Java app (make sure you have Java installed)


Running load tests on Ethics Canvas:
  1. Right-click on Test Plan -> Add -> Threads -> Thread Group
  2. On Number of Threads we enter the desired number of users: 100
  3. On Ramp-Up Period we enter the time, in seconds, over which JMeter should start all the threads: 10
  4. Right-click on Thread Group -> Add -> Sampler -> HTTP Request
  5. On Server Name we type ethicscanvas.org
  6. On Path we type our sign-in file: /php/log-in.php
  7. On Send Parameter we add the email and the password
  8. Right-click on Thread Group -> Add -> Listener -> View Results in Table
  9. Click PLAY and the test will start, populating the table with the results
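
To make the relationship between Number of Threads and Ramp-Up Period more concrete, here is a small sketch in JavaScript. It is not part of JMeter, and the function name threadStartTimes is made up for illustration. It assumes that thread starts are spread evenly across the ramp-up period, so 100 threads over 10 seconds means one new user roughly every 100 milliseconds.

```javascript
// Illustrative helper (not part of JMeter): when each thread would start,
// in milliseconds, if starts are spread evenly across the ramp-up period.
function threadStartTimes(numThreads, rampUpSeconds) {
  const intervalMs = (rampUpSeconds * 1000) / numThreads;
  const starts = [];
  for (let i = 0; i < numThreads; i++) {
    starts.push(i * intervalMs);
  }
  return starts;
}

const schedule = threadStartTimes(100, 10);
console.log(schedule.length); // 100
console.log(schedule[1]);     // 100
console.log(schedule[99]);    // 9900
```

Halving the ramp-up period while keeping the same number of threads halves the interval between users, which is why reducing it makes the test harder on the server.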


Understanding the results:
The most important factors are the average sample time (response time), deviation (from the average response time) and the status of all the responses (no errors). In our case, with an average of 2.2289 seconds from a mobile data connection, we can say we have passed the test.
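
If you export the results and want to double-check these figures yourself, the calculation could look like the sketch below. The sample data and the field names elapsedMs and success are invented for the example, and the deviation is computed as a population standard deviation, which is an assumption about how you want to measure the spread rather than a statement about JMeter's internals.

```javascript
// Hypothetical sample results (elapsed time in ms and whether the request succeeded).
const samples = [
  { elapsedMs: 2000, success: true },
  { elapsedMs: 2200, success: true },
  { elapsedMs: 2400, success: false },
  { elapsedMs: 2200, success: true },
];

// Returns the average response time, its standard deviation and the error count.
function summarise(results) {
  const times = results.map((r) => r.elapsedMs);
  const average = times.reduce((sum, t) => sum + t, 0) / times.length;
  const variance =
    times.reduce((sum, t) => sum + (t - average) ** 2, 0) / times.length;
  return {
    average: average,
    deviation: Math.sqrt(variance),
    errors: results.filter((r) => !r.success).length,
  };
}

console.log(summarise(samples));
```

A low deviation with zero errors suggests the server responds consistently under the tested load; a high deviation is a hint to look at the slowest samples individually.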

It is worth looking for unusual values in latency and connect times. Always start with lightweight tests, and gradually increase the number of threads and reduce the ramp-up period to find your limits. It is also worth testing from different Internet connections (mobile, wifi, remote,...) and never from localhost (unless the site is only used internally).

Monday, September 26, 2016

The Ethics Canvas

In 2008, Alexander Osterwalder presented an innovative tool called "Business Model Canvas" (BMC) that aimed to help entrepreneurs to capture the fundamental business knowledge about their project, and bring about pivots in order to make the business model more consistent and successful. Since then, the BMC has helped over 5 million entrepreneurs increase the value that they provide to their users, and find a sustainable model.
In 2015, a group of researchers from ADAPT Centre started using a similar approach in order to detect, at an early stage, all the ethical implications of a project, and help entrepreneurs and researchers pivot their idea in order to minimise these issues.


If you think about new technologies such as biotech, AI, IoT, VR, biometrics, blockchain, 3D printing,... they all bring great advancements for humanity, but they have some potential ethical issues that could have a catastrophic impact.

After some months of hard work and experiments, we have released this open-source brainstorming tool that we have called The Ethics Canvas. Similarly to the BMC, it enables participants to think about those ethical impacts while collaboratively completing the 12 boxes of the canvas. There are printed and web versions available, and it is licensed under Creative-Commons Non-Commercial.

SECTIONS OF THE ETHICS CANVAS:
  1. Individuals Affected. Identify the types or categories of individuals affected by the product or service, such as men/women, user/non-user, age-category, etc.
  2. Organisations and Groups Affected. Identify the collectives or communities, e.g. groups or organisations, that can be affected by your product or service, such as environmental and religious groups, unions, professional bodies, competing companies and government agencies, considering any interest they might have in the effects of the product or service.
  3. Products and Services provided. Name the different types of products and services that your project will provide.
  4. Resources needed. Capture the consumption of energy, raw materials, human resources, financial capital, social capital (trust, tolerance,...), marketing capital (reputation, brand,...), privacy and personal data needed by your product or service.
  5. Changes in Individual Behaviour. Name problematic differences in individual behaviour such as differences in habits, time-schedules, choice of activities, etc.
  6. Changes in Individual Relations. Name problematic changes in relations between individuals, such as ways of communication, frequency of interpersonal contact, etc.
  7. Organisation or Group Interests. Identify relevant ethical interests that other organisations or groups might have in your project; such as environmental, privacy, justice interests.
  8. Public Sphere. Discuss how the general perception of somebody’s role in society can be affected by the project, e.g. people behaving more individualistic or collectivist, people behaving more or less materialistic.
  9. Impact of product or service failure. Capture the potential negative impact of your product or service failing to operate as intended, e.g. technical or human error, financial failure/receivership/acquisition, security breach, data loss, etc.
  10. Impact of resource consumption. Capture possible negative impacts of the consumption of resources of your project, e.g. climate impacts, privacy impacts, employment impacts etc.
  11. Social Conflicts. Capture possible social conflicts that could be caused by the project, such as labour conflicts, minority/majority conflicts, ethnic conflicts, etc.
  12. Resolving ethical impacts. Select the four most important ethical impacts you discussed. Identify ways of solving these impacts by changing your project’s product/service design, organisation or by providing recommendations.

If you have a research or entrepreneurial project, I kindly invite you to use the Ethics Canvas with your team in order to detect the ethical impacts at an early stage. We are always looking for feedback, so please let me know what you think.

Monday, August 22, 2016

How to prevent directory listing in Apache 2

If you are using Apache 2 as a web server on Linux, you might have seen that directory listing is enabled by default.

Directory listing allows anyone to view all the files and subfolders contained in a specific folder, by entering the URL in the browser. For example, visiting the fictitious URL www.example.com/images would show a list of every file in that folder.



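This might breach some privacy protection rules. In order to disable directory listing, you just need to type the following command from the terminal and then restart Apache (for example, with sudo service apache2 restart) so that the change takes effect: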

sudo a2dismod autoindex

Thursday, July 14, 2016

Configure your .com website on Apache Server

If you want to run your website from an Apache2 server and make it available from yourdomain.com, you can follow these steps:

1) Copy the root directory of your project into /var/www/ (or clone the repository from Git there). In this example the directory is called yourproject.
2) Create a new file on /etc/apache2/sites-available/ named yourdomain.com.conf with this content:
    <VirtualHost *:80>
        DocumentRoot "/var/www/yourproject"
        ServerName www.yourdomain.com
        ServerAlias yourdomain.com
    </VirtualHost>
3) Activate the website with the command: sudo a2ensite yourdomain.com (then reload Apache with sudo service apache2 reload)
4) Find your public IP with the command: sudo ifconfig (it should appear next to inet addr or inet, depending on your version of ifconfig)
5) From your domain management site, edit the A DNS record for '@', pointing now to your public IP address
6) From your domain management site, edit the CNAME DNS record for 'www', pointing now to '@'
7) Wait a couple of hours until the DNS changes are propagated. Your website should be available from yourdomain.com and www.yourdomain.com.
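
If you would rather not keep refreshing the browser while waiting, a small Node.js script could check whether both names already resolve to your server. This is only a sketch: the function name checkDns is made up, the IP address in the usage comment is a placeholder from the documentation range, and the resolver is passed in as a parameter so the function can be tried without network access.

```javascript
const dns = require('dns');

// Resolve each hostname and report whether it points at the expected IPv4 address.
// `resolve4` defaults to Node's promise-based resolver but can be replaced for testing.
async function checkDns(
  hostnames,
  expectedIp,
  resolve4 = (name) => dns.promises.resolve4(name)
) {
  const results = [];
  for (const hostname of hostnames) {
    try {
      const addresses = await resolve4(hostname);
      results.push({ hostname, addresses, ok: addresses.includes(expectedIp) });
    } catch (err) {
      // The name does not resolve yet (or the lookup failed for another reason)
      results.push({ hostname, addresses: [], ok: false });
    }
  }
  return results;
}

// Example usage (placeholder values):
// checkDns(['yourdomain.com', 'www.yourdomain.com'], '203.0.113.10').then(console.log);
```

Keep in mind that your local resolver may cache old answers for a while, so a negative result does not necessarily mean the records are wrong.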

Monday, July 11, 2016

How to allow MySQL remote connections

If you have a MySQL database running on your server, and you want it to accept remote connections, you have to follow these steps:

1) Allow MySQL to listen to all interfaces (default is the loopback interface only):

sudo vim /etc/mysql/my.cnf
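Replace this statement:
bind-address            = 127.0.0.1
with this one:
bind-address            = 0.0.0.0
Then restart MySQL so that the new setting is applied (for example, with sudo service mysql restart).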

2) Create a new database user with all privileges granted (you shouldn't allow remote connections with the root user):

mysql -u root -p -h 0.0.0.0
mysql> CREATE USER 'yourusername'@'localhost' IDENTIFIED BY 'yourpassword';
mysql> GRANT ALL PRIVILEGES ON *.* TO 'yourusername'@'localhost' WITH GRANT OPTION;
mysql> CREATE USER 'yourusername'@'%' IDENTIFIED BY 'yourpassword';
mysql> GRANT ALL PRIVILEGES ON *.* TO 'yourusername'@'%' WITH GRANT OPTION;

3) Test the connection from a different network (make sure you don't have any proxy limitations):

mysql -u yourusername -p -h your.server.address

Sunday, July 10, 2016

Big Data and Ethics

Big Data is not precisely a new trend, but the latest advances in computing capacity have set the stage for its rise. The hype and the reality of these new developments raise ethical issues that demand deliberation.


I came across an interesting white paper titled Perspectives on Big Data, Ethics, and Society, by the Council for Big Data, Ethics, and Society, that raises concerns about the obsolescence of the Common Rule (the rule of ethics regarding research involving human subjects).

The Common Rule assumes that research methods using existing public datasets pose no risk to individual human subjects. However, new data science techniques can create composite pictures of persons from different datasets that might be innocuous on their own but produce highly sensitive personal insights when combined. Since informed consent occurs at the point of collection, before any data is used, it is not always possible to explain to the subjects all the risks that the uses of their data might carry with current and future data analytics techniques.
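
To illustrate how two harmless-looking datasets can become sensitive once combined, here is a toy sketch in JavaScript. All the records, names and field names are made up, and the function linkRecords is a hypothetical helper: it simply joins a dataset without names to a public one on a few shared attributes.

```javascript
// Made-up records for illustration only.
const healthRecords = [
  { postcode: 'D02', birthYear: 1980, gender: 'F', diagnosis: 'asthma' },
  { postcode: 'D08', birthYear: 1975, gender: 'M', diagnosis: 'diabetes' },
];
const publicRegister = [
  { name: 'Alice Example', postcode: 'D02', birthYear: 1980, gender: 'F' },
  { name: 'Bob Example', postcode: 'D04', birthYear: 1990, gender: 'M' },
];

// Join two datasets on the given keys, keeping only unambiguous matches.
function linkRecords(anonymous, identified, keys) {
  const keyOf = (record) => keys.map((k) => record[k]).join('|');
  const index = new Map();
  for (const record of identified) {
    const key = keyOf(record);
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(record);
  }
  const linked = [];
  for (const record of anonymous) {
    const matches = index.get(keyOf(record)) || [];
    // A single match ties the "anonymous" record to a named person
    if (matches.length === 1) linked.push({ ...matches[0], ...record });
  }
  return linked;
}

console.log(linkRecords(healthRecords, publicRegister, ['postcode', 'birthYear', 'gender']));
```

Neither dataset contains a name next to a diagnosis, yet the combined output does, which is the kind of risk that cannot be fully explained to a subject at the time of collection.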

In addition, the Common Rule protects individuals but it doesn't track the harms affecting communities when data is aggregated.

The Council offers the following recommendations:
  • Ensure the Common Rule clearly addresses regulation of data science. Ethics regulations should focus on what will be or could be done with datasets.
  • Seek ways to facilitate new approaches to ethics review inside academia and industry. Try new approaches that consider potential group harms in addition to individual harms.
  • Develop mechanisms of ethical assessment calibrated to the practices of big data. Expand the analysis of the ethical implications of a system throughout the entire development and usage lifecycle (which is typically different in industry and academia).
  • Create and distribute high quality data ethics case studies that address difficulties faced by data scientists and practitioners. Case studies are a valuable pedagogical resource because they facilitate collaborative discussion.
  • Develop and support data science curricula with integrative approaches to ethics education. Ethics needs to be a cornerstone of big data education.
  • Strengthen ethics-oriented activities within professional associations. Ensure ethical commitments in research and practice at the professional association level.
  • Create hybrid spaces for ethics engagement. Treat networking and collaboration as necessary components of establishing ethics capacity.
  • Build models of internal and external ethics regulation bodies in industry. Without internal, external or legal repercussions, voluntary ethics review mechanisms could be difficult to enforce.
  • Set standards for responsible cross-sector data sharing.

In this white paper the authors identify some challenging questions for future work, such as how to account for the risk of sharing datasets when we cannot know what auxiliary datasets they will be combined with in the future.

Wednesday, May 11, 2016

How to enable command autocomplete and history on Linux

After installing some Linux distributions such as Debian, you might find that the command autocomplete and command history are not enabled by default, so that you cannot use TAB or the cursor movement keys from the command line.

What is actually happening is that the shell you have in some distributions by default is sh, which is tedious to use. However, you can easily switch to a superset called bash, which incorporates these two features that will save you a lot of time. This is how you do it.

1) Type the command chsh
2) Type your password
3) Type as login shell: /bin/bash
4) Log out and sign in again. Autocomplete and command history should be enabled now.

Wednesday, April 6, 2016

How to jump to time offsets in HTML5 video

Let's say that you have a 30-minute WEBM video file, from which you just want to play the following video segments, jumping from one to the other automatically without interruptions:
  1. [00:01:25.00 - 00:02:25.00] -> from second 85 to 145
  2. [00:11:40.00 - 00:11:55.00] -> from second 700 to 715
  3. [00:20:26.00 - 00:21:07.00] -> from second 1226 to 1267
  4. [00:26:11.00 - 00:28:01.00] -> from second 1571 to 1681
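
The conversion from timestamps to seconds in the list above can be done by hand, or with a small helper like the following sketch. The function name toSeconds is made up for this example, and it assumes the timestamps always follow the HH:MM:SS.ss pattern shown above.

```javascript
// Convert a timestamp such as "00:01:25.00" into a number of seconds.
function toSeconds(timestamp) {
  const [hours, minutes, seconds] = timestamp.split(':').map(Number);
  return hours * 3600 + minutes * 60 + seconds;
}

console.log(toSeconds('00:01:25.00')); // 85
console.log(toSeconds('00:28:01.00')); // 1681
```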

To increase the complexity, let's assume that you have these video segments in a PHP variable $arrayVideoSegments (normally the case if they were retrieved from the database).

  $arrayVideoSegments[0]->startTime = 85
  $arrayVideoSegments[0]->endTime = 145
  $arrayVideoSegments[1]->startTime = 700
  $arrayVideoSegments[1]->endTime = 715
  $arrayVideoSegments[2]->startTime = 1226
  $arrayVideoSegments[2]->endTime = 1267
  $arrayVideoSegments[3]->startTime = 1571
  $arrayVideoSegments[3]->endTime = 1681

The following code will play these video segments one after the other in an HTML5 Video object, using some JavaScript and PHP.

<!-- BEGIN - HTML5 Video Object -->
<video id="myVideoPlayer" controls preload="metadata">
  <source src="myVideo.webm" type="video/webm">
</video>

<!-- END - HTML5 Video Object -->


<!-- BEGIN - Migration from PHP Array to Javascript array -->
<script>
  <?php
    // Encode this array in a Javascript-friendly array
    $jsArrayVideoSegments = json_encode($arrayVideoSegments);
    // Declare this array in Javascript from PHP
    echo "var videoSegments = ". $jsArrayVideoSegments . ";\n";
  ?>
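  // Wait until the whole page has loaded, so that playVideoSegments
  // (declared in a later script block) is already defined when it is called
  window.addEventListener('load', function() {
    playVideoSegments(videoSegments);
  });
</script>
<!-- END - Migration from PHP Array to Javascript array -->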

<!-- BEGIN - Video segment jump controller -->
<script>
  function playVideoSegments(videoSegments) {
    var currentSegment = 0; // Segment being played
    var endTime = videoSegments[currentSegment]['endTime'];
    var videoPlayer = document.getElementById('myVideoPlayer');
    videoPlayer.currentTime = videoSegments[currentSegment]['startTime'];
    videoPlayer.play(); // Starts playing the video from startTime
    // The timeupdate event fires several times per second during playback
    // (roughly every 250ms, depending on the browser)
    videoPlayer.addEventListener("timeupdate", function() {
      if(videoPlayer.currentTime >= endTime) {
        // Segment completed
        currentSegment++;
        if(currentSegment < videoSegments.length) { 
          // Not the last segment in the array
          videoPlayer.currentTime = videoSegments[currentSegment]['startTime'];
          endTime = videoSegments[currentSegment]['endTime'];
        } 
        else {
          // Last segment in the array is over
          videoPlayer.pause();
        }
      }
    }, false);
  }
</script>
<!-- END - Video segment jump controller -->
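
The decision made inside the timeupdate listener can also be written as a function that does not touch the page at all, which makes it easy to try out without a browser. The sketch below is a hypothetical refactoring, not the code used above: the function name nextStep and the shape of the object it returns are assumptions made for the example.

```javascript
// Given the current playback time and the index of the segment being played,
// decide whether to keep playing, jump to the next segment, or pause.
function nextStep(currentTime, currentSegment, videoSegments) {
  if (currentSegment >= videoSegments.length) {
    // All segments have already been played
    return { action: 'pause', currentSegment: currentSegment };
  }
  if (currentTime < videoSegments[currentSegment].endTime) {
    return { action: 'continue', currentSegment: currentSegment };
  }
  const next = currentSegment + 1;
  if (next < videoSegments.length) {
    return {
      action: 'seek',
      currentSegment: next,
      seekTo: videoSegments[next].startTime,
    };
  }
  return { action: 'pause', currentSegment: next };
}

// Same segments as in the example above
const exampleSegments = [
  { startTime: 85, endTime: 145 },
  { startTime: 700, endTime: 715 },
  { startTime: 1226, endTime: 1267 },
  { startTime: 1571, endTime: 1681 },
];

console.log(nextStep(100, 0, exampleSegments).action);  // continue
console.log(nextStep(145, 0, exampleSegments).seekTo);  // 700
console.log(nextStep(1681, 3, exampleSegments).action); // pause
```

In the listener you would then call this function with videoPlayer.currentTime and apply the returned action, setting videoPlayer.currentTime on 'seek' and calling videoPlayer.pause() on 'pause'.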

Monday, February 8, 2016

How to install an SSL certificate in Tomcat 8

If you want to run your web app on Tomcat 8 (Linux) under the HTTPS umbrella, these are the steps that you need to follow. In this example we will use the test domain example.com:

1) Purchase an SSL certificate from a trusted provider. Prices range from $5 to over $100 per year.

2) SSH to your Linux server and, from your personal directory /home/youruser/ type the following command in order to generate the private key:
keytool -genkey -alias tomcat -keyalg RSA -keystore example.keystore

3) You will be asked some questions. The most important ones are the keystore password (let's assume it is yourPassword) and the First and last names, which is actually misleading because you need to enter the domain name: example.com.

4) Generate your local Certificate Signing Request (CSR) with this command:
keytool -certreq -keyalg RSA -alias tomcat -file example.csr -keystore example.keystore

5) Open the CSR file that you have just generated with vim example.csr, select all the content and copy it to the clipboard.


6) Paste the CSR from your clipboard in your trusted provider's website in order to issue the SSL certificate.

7) You will receive an email with a root certificate, at least one intermediate certificate and a signed certificate in CRT format.

8) Transfer all these files to the /home/youruser/ directory on the server using a file transfer tool such as FileZilla.

9) Import the root certificate:
keytool -import -alias root -keystore example.keystore -trustcacerts -file yourRootCertificate.crt

10) Import each intermediate certificate (with a different alias):
keytool -import -alias intermediate1 -keystore example.keystore -trustcacerts -file yourFirstIntermediateCert.crt
keytool -import -alias intermediate2 -keystore example.keystore -trustcacerts -file yourSecondIntermediateCert.crt

11) Import your signed certificate:
keytool -import -alias tomcat -keystore example.keystore -file yourSignedCert.crt

12) From the tomcat8/conf/server.xml file, add or edit the following code:
<Connector port="8443" maxThreads="150" scheme="https" secure="true"
           SSLEnabled="true" keystoreFile="/home/youruser/example.keystore"
           keystorePass="yourPassword" clientAuth="false"
           keyAlias="tomcat" sslProtocol="TLS"/>

13) If you want to force HTTPS at all times, change the redirectPort of the port 8080 connector in server.xml to 443, and add the following code in web.xml inside the <web-app> element:
<!-- Require HTTPS for everything except /img (favicon) and /css. -->
  <security-constraint>
    <web-resource-collection>
      <web-resource-name>HTTPSOnly</web-resource-name>
      <url-pattern>/*</url-pattern>
    </web-resource-collection>
    <user-data-constraint>
      <transport-guarantee>CONFIDENTIAL</transport-guarantee>
    </user-data-constraint>
  </security-constraint>
  <security-constraint>
    <web-resource-collection>
      <web-resource-name>HTTPSOrHTTP</web-resource-name>
      <url-pattern>*.ico</url-pattern>
      <url-pattern>/img/*</url-pattern>
      <url-pattern>/css/*</url-pattern>
    </web-resource-collection>
    <user-data-constraint>
      <transport-guarantee>NONE</transport-guarantee>
    </user-data-constraint>
  </security-constraint>


14) If you are using a hosting provider such as Microsoft Azure, you will need to map the private and public ports in the virtual machine and upload the CRT certificates in the cloud service.

15) Your service should be available now on https://example.com, and automatically redirected to the HTTPS version from http://example.com.