Search in this blog

Wednesday, January 24, 2018

Desugaring foreign key violation errors from postgresql jdbc

When working with applications using relational databases (like PostgreSQL in this example), it is quite probably that you are using foreign key constraints (if not, you probably should) to protect the integrity of your data.

You usually validate the data before storing it but sometimes it is unavoidable to get these kind of errors once in a while while working on concurrent environments.

It is a good practice to differentiate between errors and failures, while using PostgreSQL JDBC you can't differentiate them easily because you always get a PSQLException, while I love PostgreSQL I hate this driver for this reason.

Desugaring the exception into specific errors with the related values is quite useful to decide if we can do anything, like retrying the operation, blaming the user (error) or blaming the server (failure).

I'll describe the way that I've used for some time to detect foreign key violations from the PostgreSQL driver, while the presented code it is Scala, it could be applicable to Java users as well.

Looking to the documentation you will find that PSQLException could contain a ServerErrorMessage, it contains the logic for parsing the server error into several nullable fields.

The important parts from the exception for this task are the SQLState (which just wraps a String) and the detail retrieved from the ServerErrorMessage.

The SQLState basically give us an error code where Integrity Constraint Violation errors start with "23", being "23503" the code for the Foreign Key Violation error.

When we get a Foreign Key Violation error, the detail retrieved from the ServerErrorMessage gives a message like this:
- Key (column)=(given_value) is not present in table "table".

In this case column could be user_id, given_value could be 1, and table being users.

Using this knowledge, we can create a mapper function that gives the specific error having the column name which could be useful to decide what to do.



Just to give you a real example, I have been using this approach with Anorm to return specific error messages to the user in a web application, I have defined AnormPostgresDAL which has the reusable error mapping to give me an application specific error.

Creating a fixed price alert, requires a foreign key to the users table (the alert owner) and the currencies table (the related currency), detecting if the user or the currency constraint is violated, and then map it to the proper error is now a simple task, see FixedPriceAlertPostgresDataHandler#create.



Disclaimer, while I call my error class PostgresIntegrityViolationError, at the moment it just holds Foreign Key Violation errors.







Tuesday, July 26, 2016

Scrapping Facebook Groups almost in real time (FGIR)

Today I want to introduce a project I developed some years ago which I called FGIR (Facebook Groups Information Retriever).

FGIR is a project intended to scrape and organize data from Facebook groups, the data is retrieved using email notifications from Facebook, they are received incredibly fast (this is the reason why I said almost in real time).

Note: Facebook is very picky with automated data collection(https://www.facebook.com/apps/site_scraping_tos_terms.php), so be careful.

So, why to scrape Facebook?
In my location, people started to create groups intended to buy and sell things. It got really popular, I remember to be joined in a group having around 200k people (which I consider to be a lot for a small city). There were really good opportunities to buy cheap things but it was really difficult to find them, groups were messy and so many people were adding posts and comments every second.

I decided to store that valuable information in order to organize it and get the products I wanted to buy.

I turned on email notifications on Facebook and wrote a client to listen for new emails, there are two kind of emails handled:
1.- Added to a new group.
2.- There is a new post in a group.

Each email was parsed to detect which kind of event happened, the first one was easier than the second one, anyway, I stored the retrieved information in a database model.

Also, I wrote a web application to query the database model to be able to find what I wanted (which is lost and I can't share).

I uploaded the project code (FGIR) to github, I'm surprised that it still works.

I hope that it can be useful for you as it was for me when I wrote it.

Thanks for reading.

Wednesday, May 4, 2016

Tuenti Challenge 6 Solutions

Tuenti Challenge 6 has finished, I liked it more than the previous one (which I liked it a lot).

I didn't solved all questions but half of them, here is are brief details for them.

Here is the repository containing all the codes: https://github.com/AlexITC/tuenti-challenge-6

I will update this post if I solve another task, I hope you enjoyed the contest as much as I did.


Challenge 1 - Team Lunch:
You are given an integer N representing the number of dinners, you have to output the minimum number of tables required where all dinners can sit together.

One table can have up to 4 dinners, adding one table can have up to 6, so, each time we add a table, we can sit 2 extra dinners, as N is small, we can try every possible value using linear search or binary search the answer, there is a direct formula too.

Notice that N can be 0, so we need to handle this corner case.

Binary search code: https://github.com/AlexITC/tuenti-challenge-6/blob/master/01/TeamLunch.java


Challenge 2 - The Voynich Manuscript:
You are given a list of words and need to answer many queries, each query contains two integers L and R, and you have to tell what are the three most common words in the interval [L, R] (from the list of words).

The list of words if fixed and the number of distinct words is not so high, we can store the indices where the word occurs (for each distinct word), a query can be answered using binary search on each work having O(M * logN) time complexity (M is the number of distinct words and N the total number of times a words occurs).

Implementation: https://github.com/AlexITC/tuenti-challenge-6/blob/master/02/TheVoynichManuscript.java


Challenge 3 - YATM Microservice:
You are given something that seems a code in a unusual programming language, the idea is to understand how it works and print the required output.

There are two ways to solve this problem:
1. Program an parser and a state machine to process all the code:
This would leads to a time consuming and a little bit boring task. Also is easy to made mistakes misunderstanding something in the logic.

2. Notice that the input is a well know format called YAML, the input represents a hash table, there are a lot of parsers for this format, having this, is really easy to code a simple state machine.

I didn't notice the format and coded the first idea, here is my long java code: https://github.com/AlexITC/tuenti-challenge-6/blob/master/03/Microservice.java


Challenge 4 - Hadouken!:
You have a fixed list with 5 strings and you are given a string S, you need to count how many times a string would match as a sub string in S if the last character is missing, in other words, count the number of partial matching requiring to change the last character to get a full matching, with the condition that you will try to match the longer string first.

The statement is confusing but you can throw the two longer strings from the fixed list and count the other 3 only, the strings are small, so brute force will work.

Implementation: https://github.com/AlexITC/tuenti-challenge-6/blob/master/04/Hadouken.java


Challenge 5 - Hangman
This was an unusual but interesting problem, indeed it was one of my favorites, you are given an IP address, a list of words and a Hangman picture.

The given address uses Telnet protocol, connecting to it gives you an interactive hangman game, typing a letter updates the interface setting the positions where the letter occurs in the word or setting an error, you are required to complete the level before the time expires, completing a level move you to the next one having less time and a longer word, the idea is to write a program to play the game fast enough to complete all levels. One important thing is that the word to guess is in the given dictionary.

My strategy to guess a word is as follows:
1. Keep in a list all the words in the dictionary having the required length of the word to guess.
2. Find the letter occurring the most times in our list.
3. If the letter was correct, find the positions where the letter occurs and filter the list with the words containing that letter in these positions only (the list is reduced), continue in 2.
4. If the letter is incorrect, filter the list with the words not containing this letter, continue in 2.

I avoid to count letters already found to not repeat the same letter twice (in step 2).

My program is semi-automatic, it require me to press enter each time a level is completed, I used this to have time to copy the keys manually.

Here is my ugly implementation: https://github.com/AlexITC/tuenti-challenge-6/blob/master/05/Hangman.java


Challenge 6 - Through the Looking-Glass
This was an interesting but frustrating task, you are given something that seems a QR image with a lot of noise.

I cleaned the image (with so many effort): https://github.com/AlexITC/tuenti-challenge-6/blob/master/06/result.png

The QR contains this URL: https://en.wikipedia.org/wiki/Piet_Mondrian

The interesting thing is that there is an Esoteric Programming Language called Piet, the noisy image seems to use the colors required by Piet, there are some interpreters online which didn't worked for me, you can try it yourself, or you can download and compile an interpreter.

Note than the image should be mirrored before giving it to the interpreter.

Also, for more information on Piet Mondrian, check out his page on Artsy: https://www.artsy.net/artist/piet-mondrian


Challenge 7 - Tiles
You are given a matrix of integers representing a tile in an infinite board, you are required to find a sub rectangle in this infinite board having the maximum possible sum or say that sum if infinite.

The sum is infinite if the sum in a row or the sum in a column in the matrix is greater than zero, this is true because the row or column can be used indefinitely and the sum will increase every time.

Assuming the sum is not infinite means that the rectangle is contained in the matrix or is a cyclic rectangle (a suffix + a prefix), we can copy the matrix three times to form a bigger matrix and find the maximum sub rectangle sum (which is a well known problem), this can be done with O(N^3) time complexity extending Kadane's algorithm.

Implementation: https://github.com/AlexITC/tuenti-challenge-6/blob/master/07/Tiles.java


Challenge 8 - Labyrinth
This was another unusual but interesting problem which I enjoyed solving it, you are given an IP address allowing socket connections, it shows you a partial maze with 7x7 tiles, and your current position, you can traverse the maze using the letters 'u', 'd', 'l', 'r', each time you move, the partial maze is scrolled revealing new tiles.

In this challenge you need to get the key which is in the borders of the maze, so you have to explore the whole maze, to make the things more interesting, once a connection is started, it will be expired after 13 minutes.

I used two algorithms, one going always to the left and the other going to the right, after each move I save the new tiles, each algorithm discovered more than half of the maze, after storing the results in a file I merged them into a single one to recover the key.

Implementation: https://github.com/AlexITC/tuenti-challenge-6/blob/master/08/Labyrinth.java


Challenge 9 - 0-1 Immiscible numbers
You are given an integer N, and you have to find the smallest multiple of N composed by ones and zeros only, the ones should occur before zeros.

There is an interesting property I found which I can't proof: any integer X not divisible by 2 or 5 have a multiple composed of 1's only, this multiple is the smallest for our problem.

Another property: Any integer of the form 2^N or 5^N or have a multiple containing exactly N 0's.

The second property tend to be really useful, for any integer X the number of zeros is max(twos, fives), where X is divisible by 2^twos and 5^fives.

We can count the number of zeros required and reduce the number removing prime factors 2 and 5, counting ones can be done with a linear time algorithm, this is quite slow for bit cases but is enough to get the output required.

Implementation: https://github.com/AlexITC/tuenti-challenge-6/blob/master/09/ImmiscibleNumbers.java


Challenge 12 - Pika Virus
You are given two trees and have to check they have the same topological structure, in order words, check is they are isomorphic, there are not so hard algorithms to check this in polynomial time, the constraints are small so we can try a simpler and slower approach.

If the trees are isomorphic we are required to match each node from a tree with the one lexicographically smallest node from the other tree, and the ancestors doesn't need to match. This crazy condition made me to loss several hours trying to figure out why my output was not the expected one, and as I saw in twitter, my case was not the only one.

Implementation: https://github.com/AlexITC/tuenti-challenge-6/blob/master/12/PikaVirus.java


Challenge 14 - SpeedRunner
You are given a matrix representing a board game, there are some rules defined about the movements. In brief, you are standing in a tile and want to exit, before exit you need to grab all the keys, print the minimum amount of frames to exit breaking times with keys pressed, breaking ties with the lexicographically smallest string of keys pressed.

The maximum number of keys is 16, so we can use dynamic programming with bit masks to find the optimal order to grab the keys (see Traveling Salesman Problem), we need a function to compute the optimal path from a tile S to a tile T, removing the lexicographically constraint, a simple Dijkstra's algorithm will work.

We can compute the optimal route from any tile to the tile T (reversing the graph leads to a single Dijkstra), after this we start at S reconstructing the optimal path to reach T, in each step we choose a tile being in one of the optimal paths and keep the lexicographically smallest.

You need to take care implementing this because the code is long and can be tricky.

Implementation: https://github.com/AlexITC/tuenti-challenge-6/blob/master/14/SpeedRunner.java

Saturday, December 5, 2015

Enable Shared Folders and Clipboard in Virtualized Linux using VirtualBox

I will assume you have installed a Linux distribution in VirtualBox, to enable the shared folders and clipboard you have to follow these steps:
1- Open a terminal
2- apt-get update
3- apt-get upgrade
4- apt-get install build-essential module-assistant
5- m-a prepare
6- in VirtualBox menu bar choose 'devices -> Install Guest Additions'
7- Go to the terminal and run 'mount /media/cdrom'
8- run 'sh /media/cdrom/VBoxLinuxAdditions.run' and follow the instructions.

Notes:
- If you get errors after step 6, reboot your Linux and continue in step 6.
- You may need root access to run some commands.

After these steps you should be able to share folders and clipboard, you may need to reboot after step 8.

Sunday, October 25, 2015

Competitive Programming Resources

Competitive Programming / Programming Contest Resources for learning things

This list is going to be edited frequently when I find good resources to learn things.

Tools:
Graph drawing: http://mxwell.github.io/draw-graph/


Improving in CP


Dynamic Programming Optimization Problems:
http://codeforces.com/blog/entry/47932

Non Trivial DP tricks:
http://codeforces.com/blog/entry/47764

Slope Trick:
http://codeforces.com/blog/entry/47821

Floyd Warshall:
https://www.quora.com/Why-is-the-order-of-the-loops-in-Floyd-Warshall-algorithm-important-to-its-correctness/answer/Jayaprakash-Sundararaj

Manacher's Algorithm:
https://www.quora.com/What-is-Manachers-algorithm-How-can-we-use-it-to-find-all-palindromic-sub-strings-in-a-string

Meet in the middle:
http://qr.ae/R3GIji


Suffix Array - Linear time construction (DC3) explained:
http://qr.ae/R6EeWR

Fractional Cascading:
http://codeforces.com/blog/entry/21892
http://www.cs.uu.nl/docs/vakken/ga/slides5b.pdf (Page 43)


Bipartite Matching:
http://www.slideshare.net/KuoE0/acmicpc-bipartite-matching?next_slideshow=1


Algorithms and Data Structures Implementations:
https://sites.google.com/site/indy256/


Suffix Automaton
http://codeforces.com/blog/entry/20861

Binary Indexed Tree for Range Minimum Query
http://ioinformatics.org/oi/pdf/v9_2015_39_44.pdf

Binary Indexed Tree for coefficients in functions:
http://apps.topcoder.com/forums/?module=Thread&threadID=756271


SQRT Decomposition


Combinatorics:

Trie

Sweep Line


SPOJ Must Solve problems

Centroid Decomposition


Competitive programming courses


Matching in general graph

Matrix Power


Matrix Power alternative using DP

Treap


Splay Tree

Segment Tree

2D Segment Tree

Persistent Segment Tree

Ordered Sets



Data Structures

Map Short Permutation (11, 12 items) to integer

Segmented Sieve


Dynamic Connectivity

Aho-Corasick

Find primes in some interval


Suffix Tree

Suffix Array


Sieve Methods


2-SAT


Graphs

FFT
http://codeforces.com/blog/entry/14744 
http://codeforces.com/blog/entry/20751
http://www.cs.cmu.edu/afs/cs/academic/class/15451-s10/www/lectures/lect0423.txt

MO’s Algorithm

Game Theory

Manacher’s Algorithm

Minimum Diameter Spanning Tree

Dynamic Programming (DP)

DP with bitmasks

DP over integers

DP Optimizations



Heavy Light Decomposition
http://blog.anudeep2011.com/heavy-light-decomposition/

Ternary Search


Trie

Z-Algorithm


Nim Game
http://web.mit.edu/sp.268/www/nim.pdf


World Finals 2015 solutions

Andrew Stankevich Contests

Regional Contests with editorial

Training Advices

Saratov University Notebook


Complex Numbers in Geometry
https://uqu.edu.sa/files2/tiny_mce/plugins/filemanager/files/4041834/cnum_mr.pdf





Link-Cut Tree
http://www.cs.cmu.edu/~sleator/papers/dynamic-trees.pdf
http://www.cs.cmu.edu/~avrim/451f12/lectures/lect1009-linkcut.txt

Burnside's lemma
http://petr-mitrichev.blogspot.com/2008/11/burnsides-lemma.html