# aliquot

## < a quantity that can be divided into another a whole number of time />

One of Rosalind problems deals with finding the “majority element” of a vector, which is defined as the value that appears in more than half of the entries. Note that this is a little different from what we usually call the mode in statistics. This is a pretty simple task to run using, e.g., Python or R.

Here is a solution in Python:

from collections import Counter

def maj(lst):
""""
>>> items = [8, 7, 7, 7, 1, 7, 3, 7]
>>> print(maj(items))
7
"""
c = Counter(lst)
val, cnt = c.most_common()[0]
if cnt > len(lst)/2:
return int(val)
else:
return -1


We could probably use numpy and/or scipy as well, but let's keep it simple. I really like the Counter idiom in Python, which allows to keep a record of the item index that is being processed much like enumerate does. I missed this in Racket. It is quite easy to tabulate a list of values using a for/list pattern, as the following code illustrates:

(define (counter x lst)
(if (null? lst) 0
(if (= x (car lst))
(+ 1 (counter x (cdr lst)))
(counter x (cdr lst)))))

(define items (list 8 7 7 7 1 7 3 7))
(apply max (for/list ([i (remove-duplicates items)]) (counter i items)))


If we want to return both the item value and the associated counts, there are two options (and probably more): either we rely on a hash table, or we return a list (or a pair) of values and counts, like Python's enumerate; this could be something like replacing (counter i items) with (list i (counter i items)), but we will need to update the function used when calling apply because max will not accept a tuple of values.

Here is a little example of using a hash instead of plain lists:

;; Credit: https://stackoverflow.com/a/5741004
(define (bagify xs)
(for/fold ((ht #hash()))
((key (in-list xs)))
(hash-update ht key add1 0)))

(define (which-max ht)
(car (argmax cdr (hash->list ht))))

(define table (bagify items))
(which-max table)


This also (sort of) solves the problem of the joint use of apply and max described above:

(argmax cadr (for/list ([i (remove-duplicates items)]) (list i (counter i items))))


The same approach could obviously serve the purpose of finding the mode in a list of values, e.g.:

(define (mode xs)
(cond [(empty? xs) '()]
[else (car (argmax cdr
(hash->list (bagify xs))))]))


Note that the above procedure will not handle the case of multiple modes.

Finally, what about enumerate like idiom in Racket? Here's one solution:

(define zip (lambda (x y) (map list x y)))

(define (enumerate xs)
(zip (range (length xs)) xs))