sorting algorithm

64 results back to index


pages: 120 words: 17,504

Data Structures & Algorithms Interview Questions You'll Most Likely Be Asked by Vibrant Publishers

NP-complete, sorting algorithm, traveling salesman

O(n2) algorithms such as selection sort or bubble sort; the best case (nearly sorted input) is O(n) e) stable - does not change the relative order of elements with equal keys f) in-place - only requires a constant amount O( 1) of additional memory space g) online - can sort a list as it receives it 179: What search algorithm does straight insertion sorting algorithm use? Answer: Straight insertion sorting uses a linear search algorithm to locate the position at which the next element is to be inserted. 180: What is the difference between insertion and exchange sorting? Answer: In insertion sorting algorithms, insertion is performed into a sorted list. On the other hand, an exchange sort does not necessarily make use of such a sorted list. 181: Give examples of exchange sorting algorithms: Answer: The most well known exchange sorting algorithms are the following: a) Bubble Sort - simple sorting algorithm that works by repeatedly stepping through the list to be sorted, comparing each pair of adjacent items and swapping them if they are in the wrong order b) Quicksort - a divide-and-conquer style algorithm which divides the original list into two sub-lists and sorts recursively each list 182: Shortly describe the quicksort algorithm.

The basic steps to perform are the following: a) divide the sequence into two sequences of length b) recursively sort each of the two subsequences c) merge the sorted subsequences to obtain the final result 187: What is distribution sorting? Give example of several distribution sorting algorithms. Answer: It is a sorting algorithm which has the unique characteristic that it does not make use of comparisons to do the sorting. Instead, distribution sorting algorithms rely on a priori knowledge about the universal set from which the elements to be sorted are drawn. The most well-known distribution sorting algorithms are bucket sort and radix sort. 188: Describe the adaptive heap sort algorithm. Answer: It is a variant of heapsort that uses a randomized binary search tree to structure the input according to any preexisting order.

171: Describe on short the branch and bound algorithm. 172: Describe divide and conquer paradigm. 173: Describe dynamic programming paradigm. 174: How does a simulated annealing algorithm work? 175: Provide a short description of Euclid's algorithm. Provide its recursive description: Sorting Algorithms 176: Which are the sorting algorithms categories? 177: Describe on short an insertion sorting algorithm. 178: Which are the advantages provided by insertion sort? 179: What search algorithm does straight insertion sorting algorithm use? 180: What is the difference between insertion and exchange sorting? 181: Give examples of exchange sorting algorithms: 182: Shortly describe the quicksort algorithm. 183: Describe several variations of quicksort algorithm. 184: What is the difference between selection and insertion sorting?


Algorithms Unlocked by Thomas H. Cormen

bioinformatics, Donald Knuth, knapsack problem, NP-complete, optical character recognition, P = NP, Silicon Valley, sorting algorithm, traveling salesman

We define a comparison sort as any sorting algorithm that determines the sorted order only by comparing pairs of elements. The four sorting algorithms from the previous chapter are comparison sorts, but R EALLY-S IMPLE -S ORT is not. 62 Chapter 4: A Lower Bound for Sorting and How to Beat It Here’s the lower bound: In the worst case, any comparison sorting algorithm for n elements requires .n lg n/ comparisons between pairs of elements. Recall that -notation gives a lower bound, so that what we’re saying is “for sufficiently large n, any comparison sorting algorithm requires at least cn lg n comparisons in the worst case, for some constant c.”

I meant .n/ time, since it makes sense that we have to examine each element, even if we’re not comparing pairs of elements. The second important thing is truly remarkable: this lower bound does not depend on the particular algorithm, as long as it’s a comparison sorting algorithm. The lower bound applies to every comparison sorting algorithm, no matter how simple or complex. The lower bound applies to comparison sorting algorithms that have already been invented or will be invented in the future. It even applies to comparison sorting algorithms that will never be discovered by mankind! Beating the lower bound with counting sort We’ve already seen how to beat the lower bound in a highly restricted setting: there are only two possible values for the sort keys, and each element consists of only a sort key, with no satellite data.

How, then, do we get the array to be sorted in the first place? In this chapter, we’ll see four algorithms—selection sort, insertion sort, merge sort, and quicksort—to sort an array, applying each of these algorithms to our bookshelf example. Each sorting algorithm will have its advantages and its disadvantages, and at the end of the chapter we’ll review and compare these sorting algorithms. All of the sorting algorithms that we’ll see in this chapter take either ‚.n2 / or ‚.n lg n/ time in the worst case. Therefore, if you were going to perform only a few searches, you’d be better off just running linear search. But if you were going to search many times, you might be better off first sorting the array and then searching by running binary search.


The Art of Computer Programming: Sorting and Searching by Donald Ervin Knuth

card file, Charles Babbage, Claude Shannon: information theory, complexity theory, correlation coefficient, Donald Knuth, double entry bookkeeping, Eratosthenes, Fermat's Last Theorem, G4S, information retrieval, iterative process, John von Neumann, linked data, locality of reference, Menlo Park, Norbert Wiener, NP-complete, p-value, Paul Erdős, RAND corporation, refrigerator car, sorting algorithm, Vilfredo Pareto, Yogi Berra, Zipf's Law

The fastest general algorithm we have considered that sorts keys in a stable manner is the list merge sort (Algorithm 5.2.4L), but it does not use minimum storage. In fact, the only stable minimum-storage sorting algorithms we have seen are fl(N2) methods (straight insertion, bubble sorting, and a variant of straight selection). Design a stable minimum-storage sorting algorithm that needs only O(N (log NJ) units of time in its worst case. [Hint: It is possible to do stable minimum-storage merging in O(N log N) units of time.] 5.5 SUMMARY, HISTORY, AND BIBLIOGRAPHY 391 > 4. [28] A sorting algorithm is called parsimonious if it makes decisions entirely by comparing keys, and if it never makes a comparison whose outcome could have been predicted from the results of previous comparisons.

[M24] Let gMN{z) = YlVMNkZk, where PMNk is the probability that exactly k empty piles are present after a random radix-sort pass puts N elements into M piles. a) Show that gM(N+i)(z) = gMN(z) + (A - z)/M)g'MN(z). b) Use this relation to find simple expressions for the mean and variance of this probability distribution, as a function of M and N. 7. [20] Discuss the similarities and differences between Algorithm R and radix ex- exchange sorting (Algorithm 5.2.2R). > 8. [20] The radix-sorting algorithms discussed in the text assume that all keys being sorted are nonnegative. What changes should be made to the algorithms when the keys are numbers expressed in two's complement or ones' complement notation? 9. [20] Continuing exercise 8, what changes should be made to the algorithms when the keys are numbers expressed in signed-magnitude notation? 178 SORTING 5.2.5 10. [30] Design an efficient most-significant-digit-first radix-sorting algorithm that uses linked memory. (As the size of the subfiles decreases, it is wise to decrease M, and to use a nonradix method on the really short subfiles.) 11. [16] The sixteen input numbers shown in Table 1 start with 41 inversions; after sorting is complete, of course, there are no inversions remaining.

., N}; then the record containing K can simply be placed in location TABLE + K. Both of these techniques were used to elimi- eliminate searching from the topological sorting algorithm discussed in Section 2.2.3. However, searches would have been necessary if the objects in the topological sorting algorithm had been given symbolic names instead of numbers. Efficient algorithms for searching turn out to be quite important in practice. Search methods can be classified in several ways. We might divide them into internal versus external searching, just as we divided the sorting algorithms of Chapter 5 into internal versus external sorting. Or we might divide search methods into static versus dynamic searching, where "static" means that the contents of the table are essentially unchanging (so that it is important to min- minimize the search time without regard for the time required to set up the table), and "dynamic" means that the table is subject to frequent insertions and perhaps also deletions.


pages: 752 words: 131,533

Python for Data Analysis by Wes McKinney

Alignment Problem, backtesting, Bear Stearns, cognitive dissonance, crowdsourcing, data science, Debian, duck typing, Firefox, functional programming, Google Chrome, Guido van Rossum, index card, machine readable, random walk, recommendation engine, revision control, sentiment analysis, Sharpe ratio, side project, sorting algorithm, statistical model, type inference

, Structured Array Manipulations: numpy.lib.recfunctions–Structured Array Manipulations: numpy.lib.recfunctions, More About Sorting–numpy.searchsorted: Finding elements in a Sorted Array, Indirect Sorts: argsort and lexsort–Indirect Sorts: argsort and lexsort, Alternate Sort Algorithms–Alternate Sort Algorithms, numpy.searchsorted: Finding elements in a Sorted Array–numpy.searchsorted: Finding elements in a Sorted Array, NumPy Matrix Class–NumPy Matrix Class, Advanced Array Input and Output–HDF5 and Other Array Storage Options, Performance Tips–Other Speed Options: Cython, f2py, C, The Importance of Contiguous Memory–The Importance of Contiguous Memory, Other Speed Options: Cython, f2py, C–Other Speed Options: Cython, f2py, C arrays in, Advanced Array Manipulation–Fancy Indexing Equivalents: Take and Put, Reshaping Arrays–Reshaping Arrays, C versus Fortran Order–C versus Fortran Order, Concatenating and Splitting Arrays–Stacking helpers: r_ and c_, Concatenating and Splitting Arrays–Stacking helpers: r_ and c_, Stacking helpers: r_ and c_–Stacking helpers: r_ and c_, Stacking helpers: r_ and c_–Stacking helpers: r_ and c_, Repeating Elements: Tile and Repeat–Repeating Elements: Tile and Repeat, Fancy Indexing Equivalents: Take and Put–Fancy Indexing Equivalents: Take and Put, Advanced Array Input and Output–HDF5 and Other Array Storage Options concatenating, Concatenating and Splitting Arrays–Stacking helpers: r_ and c_ c_ object, Stacking helpers: r_ and c_–Stacking helpers: r_ and c_ layout of in memory, C versus Fortran Order–C versus Fortran Order replicating, Repeating Elements: Tile and Repeat–Repeating Elements: Tile and Repeat reshaping, Reshaping Arrays–Reshaping Arrays r_ object, Stacking helpers: r_ and c_–Stacking helpers: r_ and c_ saving to file, Advanced Array Input and Output–HDF5 and Other Array Storage Options splitting, Concatenating and Splitting Arrays–Stacking helpers: r_ and c_ subsets for, Fancy Indexing Equivalents: Take and Put–Fancy Indexing Equivalents: Take and Put broadcasting, Broadcasting–Setting Array Values by Broadcasting, Broadcasting Over Other Axes–Broadcasting Over Other Axes, Setting Array Values by Broadcasting–Setting Array Values by Broadcasting over other axes, Broadcasting Over Other Axes–Broadcasting Over Other Axes setting array values by, Setting Array Values by Broadcasting–Setting Array Values by Broadcasting data processing using, Expressing Conditional Logic as Array Operations–Expressing Conditional Logic as Array Operations where function, Expressing Conditional Logic as Array Operations–Expressing Conditional Logic as Array Operations data processing using arrays, Data Processing Using Arrays–Unique and Other Set Logic, Expressing Conditional Logic as Array Operations–Expressing Conditional Logic as Array Operations, Mathematical and Statistical Methods–Mathematical and Statistical Methods, Methods for Boolean Arrays–Methods for Boolean Arrays, Sorting–Sorting, Unique and Other Set Logic–Unique and Other Set Logic conditional logic as array operation, Expressing Conditional Logic as Array Operations–Expressing Conditional Logic as Array Operations methods for boolean arrays, Methods for Boolean Arrays–Methods for Boolean Arrays sorting arrays, Sorting–Sorting statistical methods, Mathematical and Statistical Methods–Mathematical and Statistical Methods unique function, Unique and Other Set Logic–Unique and Other Set Logic data types for, ndarray Object Internals–NumPy dtype Hierarchy file input and output with arrays, File Input and Output with Arrays–Saving and Loading Text Files, Storing Arrays on Disk in Binary Format–Storing Arrays on Disk in Binary Format, Saving and Loading Text Files–Saving and Loading Text Files saving and loading text files, Saving and Loading Text Files–Saving and Loading Text Files storing on disk in binary format, Storing Arrays on Disk in Binary Format–Storing Arrays on Disk in Binary Format linear algebra, Linear Algebra–Linear Algebra matrix operations in, NumPy Matrix Class–NumPy Matrix Class ndarray arrays, The NumPy ndarray: A Multidimensional Array Object, Creating ndarrays–Creating ndarrays, Data Types for ndarrays–Data Types for ndarrays, Operations between Arrays and Scalars–Operations between Arrays and Scalars, Basic Indexing and Slicing–Indexing with slices, Basic Indexing and Slicing–Indexing with slices, Boolean Indexing–Boolean Indexing, Fancy Indexing–Fancy Indexing, Transposing Arrays and Swapping Axes–Transposing Arrays and Swapping Axes, Transposing Arrays and Swapping Axes–Transposing Arrays and Swapping Axes Boolean indexing, Boolean Indexing–Boolean Indexing creating, Creating ndarrays–Creating ndarrays data types for, Data Types for ndarrays–Data Types for ndarrays fancy indexing, Fancy Indexing–Fancy Indexing indexes for, Basic Indexing and Slicing–Indexing with slices operations between arrays, Operations between Arrays and Scalars–Operations between Arrays and Scalars slicing arrays, Basic Indexing and Slicing–Indexing with slices swapping axes in, Transposing Arrays and Swapping Axes–Transposing Arrays and Swapping Axes transposing, Transposing Arrays and Swapping Axes–Transposing Arrays and Swapping Axes numpy-discussion (mailing list), Community and Conferences performance of, Performance Tips–Other Speed Options: Cython, f2py, C, The Importance of Contiguous Memory–The Importance of Contiguous Memory, Other Speed Options: Cython, f2py, C–Other Speed Options: Cython, f2py, C contiguous memory, The Importance of Contiguous Memory–The Importance of Contiguous Memory Cython project, Other Speed Options: Cython, f2py, C–Other Speed Options: Cython, f2py, C random number generation, Random Number Generation–Random Number Generation random walks example, Example: Random Walks–Simulating Many Random Walks at Once sorting, More About Sorting–numpy.searchsorted: Finding elements in a Sorted Array, Indirect Sorts: argsort and lexsort–Indirect Sorts: argsort and lexsort, Alternate Sort Algorithms–Alternate Sort Algorithms, numpy.searchsorted: Finding elements in a Sorted Array–numpy.searchsorted: Finding elements in a Sorted Array algorithms for, Alternate Sort Algorithms–Alternate Sort Algorithms finding elements in sorted array, numpy.searchsorted: Finding elements in a Sorted Array–numpy.searchsorted: Finding elements in a Sorted Array indirect sorts, Indirect Sorts: argsort and lexsort–Indirect Sorts: argsort and lexsort structured arrays in, Structured and Record Arrays–Structured Array Manipulations: numpy.lib.recfunctions, Nested dtypes and Multidimensional Fields–Nested dtypes and Multidimensional Fields, Why Use Structured Arrays?

search method, Regular expressions, Regular expressions, Regular expressions, Regular expressions searchsorted method, numpy.searchsorted: Finding elements in a Sorted Array, numpy.searchsorted: Finding elements in a Sorted Array, numpy.searchsorted: Finding elements in a Sorted Array seed function, Random Number Generation seek method, Files and the operating system semantics, Language Semantics–Mutable and immutable objects, Indentation, not braces–Indentation, not braces, Everything is an object–Everything is an object, Comments–Comments, Function and object method calls–Function and object method calls, Function and object method calls–Function and object method calls, Variables and pass-by-reference–Variables and pass-by-reference, Variables and pass-by-reference–Variables and pass-by-reference, Dynamic references, strong types–Dynamic references, strong types, Attributes and methods–Attributes and methods, “Duck” typing–“Duck” typing, Imports–Imports, Binary operators and comparisons–Binary operators and comparisons, Strictness versus laziness–Strictness versus laziness, Mutable and immutable objects–Mutable and immutable objects attributes in, Attributes and methods–Attributes and methods comments in, Comments–Comments “duck” typing, “Duck” typing–“Duck” typing functions in, Function and object method calls–Function and object method calls import directive, Imports–Imports indentation, Indentation, not braces–Indentation, not braces methods in, Function and object method calls–Function and object method calls mutable objects in, Mutable and immutable objects–Mutable and immutable objects object model, Everything is an object–Everything is an object operators for, Binary operators and comparisons–Binary operators and comparisons references in, Variables and pass-by-reference–Variables and pass-by-reference strict evaluation, Strictness versus laziness–Strictness versus laziness strongly-typed language, Dynamic references, strong types–Dynamic references, strong types variables in, Variables and pass-by-reference–Variables and pass-by-reference semicolons, Indentation, not braces sentinels, Handling Missing Data, Reading and Writing Data in Text Format sep argument, Reading and Writing Data in Text Format sequence functions, Built-in Sequence Functions–reversed, enumerate–enumerate, sorted–sorted, zip–zip, reversed–reversed enumerate function, enumerate–enumerate reversed function, reversed–reversed sorted function, sorted–sorted zip function, zip–zip Series data structure, Introduction to pandas Data Structures, Series–Series, Operations between DataFrame and Series–Operations between DataFrame and Series, Grouping with Dicts and Series–Grouping with Dicts and Series arithmetic operations between DataFrame and, Operations between DataFrame and Series–Operations between DataFrame and Series grouping with, Grouping with Dicts and Series–Grouping with Dicts and Series set comprehensions, List, Set, and Dict Comprehensions–Nested list comprehensions set function, Set setattr function, Attributes and methods setdefault method, Default values setdiff1d method, Unique and Other Set Logic sets/set comprehensions, Set–Set setxor1d method, Unique and Other Set Logic set_index function, Using a DataFrame’s Columns, Using a DataFrame’s Columns set_index method, Pivoting “long” to “wide” Format set_title method, Setting the title, axis labels, ticks, and ticklabels set_trace function, Other ways to make use of the debugger, Other ways to make use of the debugger, Other ways to make use of the debugger set_value method, Indexing, selection, and filtering set_xlabel method, Setting the title, axis labels, ticks, and ticklabels set_xlim method, Ticks, Labels, and Legends set_xticklabels method, Setting the title, axis labels, ticks, and ticklabels set_xticks method, Setting the title, axis labels, ticks, and ticklabels shapefiles, Plotting Maps: Visualizing Haiti Earthquake Crisis Data shapes, The NumPy ndarray: A Multidimensional Array Object, ndarray Object Internals sharex option, Figures and Subplots, Line Plots sharey option, Figures and Subplots, Line Plots shell commands in IPython, Shell Commands and Aliases–Shell Commands and Aliases shifting in time series data, Shifting (Leading and Lagging) Data–Shifting dates with offsets shortcuts, keyboard, Keyboard Shortcuts–Keyboard Shortcuts, Keyboard Shortcuts, Keyboard Shortcuts, Keyboard Shortcuts for deleting text, Keyboard Shortcuts for IPython, Keyboard Shortcuts–Keyboard Shortcuts shuffle function, Random Number Generation sign function, Universal Functions: Fast Element-wise Array Functions, Detecting and Filtering Outliers signal frontier analysis, Signal Frontier Analysis–Signal Frontier Analysis sin function, Universal Functions: Fast Element-wise Array Functions sinh function, Universal Functions: Fast Element-wise Array Functions size method, GroupBy Mechanics skew method, Summarizing and Computing Descriptive Statistics skipinitialspace option, Manually Working with Delimited Formats skipna method, Summarizing and Computing Descriptive Statistics skipna option, Summarizing and Computing Descriptive Statistics skiprows argument, Reading and Writing Data in Text Format skip_footer argument, Reading and Writing Data in Text Format slice method, Vectorized string functions in pandas slicing, Basic Indexing and Slicing–Indexing with slices, Slicing–Slicing arrays, Basic Indexing and Slicing–Indexing with slices lists, Slicing–Slicing Social Security Administration (SSA), US Baby Names 1880-2010 solve function, Linear Algebra sort argument, Database-style DataFrame Merges sort method, Sorting, More About Sorting, More About Sorting, Sorting, Anonymous (lambda) Functions sorted function, sorted–sorted, sorted, sorted sorting, Sorting–Sorting, Sorting and ranking–Sorting and ranking, Reordering and Sorting Levels–Reordering and Sorting Levels, More About Sorting–numpy.searchsorted: Finding elements in a Sorted Array, Indirect Sorts: argsort and lexsort–Indirect Sorts: argsort and lexsort, Alternate Sort Algorithms–Alternate Sort Algorithms, numpy.searchsorted: Finding elements in a Sorted Array–numpy.searchsorted: Finding elements in a Sorted Array, numpy.searchsorted: Finding elements in a Sorted Array–numpy.searchsorted: Finding elements in a Sorted Array, Sorting–Sorting arrays, Sorting–Sorting finding elements in sorted array, numpy.searchsorted: Finding elements in a Sorted Array–numpy.searchsorted: Finding elements in a Sorted Array in NumPy, More About Sorting–numpy.searchsorted: Finding elements in a Sorted Array, Indirect Sorts: argsort and lexsort–Indirect Sorts: argsort and lexsort, Alternate Sort Algorithms–Alternate Sort Algorithms, numpy.searchsorted: Finding elements in a Sorted Array–numpy.searchsorted: Finding elements in a Sorted Array algorithms for, Alternate Sort Algorithms–Alternate Sort Algorithms finding elements in sorted array, numpy.searchsorted: Finding elements in a Sorted Array–numpy.searchsorted: Finding elements in a Sorted Array indirect sorts, Indirect Sorts: argsort and lexsort–Indirect Sorts: argsort and lexsort in pandas, Sorting and ranking–Sorting and ranking levels, Reordering and Sorting Levels–Reordering and Sorting Levels lists, Sorting–Sorting sortlevel function, Reordering and Sorting Levels, Reordering and Sorting Levels sort_columns argument, Line Plots sort_index method, Sorting and ranking, Reordering and Sorting Levels, Indirect Sorts: argsort and lexsort spaces, structuring code with, Indentation, not braces–Indentation, not braces spacing around subplots, Adjusting the spacing around subplots–Adjusting the spacing around subplots span, Exponentially-weighted functions specialized frequencies, Operations with Time Series of Different Frequencies–Using periods instead of timestamps data munging for, Operations with Time Series of Different Frequencies–Using periods instead of timestamps split method, Manually Working with Delimited Formats, String Object Methods, Regular expressions, Vectorized string functions in pandas, Concatenating and Splitting Arrays split-apply-combine, GroupBy Mechanics splitting arrays, Concatenating and Splitting Arrays–Stacking helpers: r_ and c_ SQL databases, Interacting with Databases sql module, Interacting with Databases SQLite databases, Interacting with Databases sqrt function, Universal Functions: Fast Element-wise Array Functions, Universal Functions: Fast Element-wise Array Functions square function, Universal Functions: Fast Element-wise Array Functions squeeze argument, Reading and Writing Data in Text Format SSA (Social Security Administration), US Baby Names 1880-2010 stable sorting, Alternate Sort Algorithms stacked format, Pivoting “long” to “wide” Format start index, Slicing, Slicing startswith method, String Object Methods, Vectorized string functions in pandas, Vectorized string functions in pandas statistical methods, Mathematical and Statistical Methods–Mathematical and Statistical Methods std method, Mathematical and Statistical Methods, Summarizing and Computing Descriptive Statistics, Data Aggregation stdout, Writing Data Out to Text Format step index, Slicing stop index, Slicing, Slicing strftime method, Converting between string and datetime, Dates and times strict evaluation/language, Strictness versus laziness–Strictness versus laziness, Strictness versus laziness strides/strided view, ndarray Object Internals, ndarray Object Internals strings, Data Types for ndarrays, Data Types for ndarrays, String Manipulation–Vectorized string functions in pandas, String Object Methods–String Object Methods, Regular expressions–Regular expressions, Vectorized string functions in pandas–Vectorized string functions in pandas, Converting between string and datetime–Converting between string and datetime, Strings–Strings converting to datetime, Converting between string and datetime–Converting between string and datetime data types for, Data Types for ndarrays, Data Types for ndarrays, Strings–Strings manipulating, String Manipulation–Vectorized string functions in pandas, String Object Methods–String Object Methods, Regular expressions–Regular expressions, Vectorized string functions in pandas–Vectorized string functions in pandas methods for, String Object Methods–String Object Methods vectorized string methods, Vectorized string functions in pandas–Vectorized string functions in pandas with regular expressions, Regular expressions–Regular expressions strip method, String Object Methods, Vectorized string functions in pandas, Vectorized string functions in pandas strongly-typed languages, Dynamic references, strong types–Dynamic references, strong types, Dynamic references, strong types strptime method, Converting between string and datetime, Dates and times structs, Structured and Record Arrays structured arrays, Structured and Record Arrays–Structured Array Manipulations: numpy.lib.recfunctions, Structured and Record Arrays, Nested dtypes and Multidimensional Fields–Nested dtypes and Multidimensional Fields, Why Use Structured Arrays?

Suppose we wanted to sort some data identified by first and last names: In [182]: first_name = np.array(['Bob', 'Jane', 'Steve', 'Bill', 'Barbara']) In [183]: last_name = np.array(['Jones', 'Arnold', 'Arnold', 'Jones', 'Walters']) In [184]: sorter = np.lexsort((first_name, last_name)) In [185]: zip(last_name[sorter], first_name[sorter]) Out[185]: [('Arnold', 'Jane'), ('Arnold', 'Steve'), ('Jones', 'Bill'), ('Jones', 'Bob'), ('Walters', 'Barbara')] lexsort can be a bit confusing the first time you use it because the order in which the keys are used to order the data starts with the last array passed. As you can see, last_name was used before first_name. Note pandas methods like Series’s and DataFrame’s sort_index methods and the Series order method are implemented with variants of these functions (which also must take into account missing values) Alternate Sort Algorithms A stable sorting algorithm preserves the relative position of equal elements. This can be especially important in indirect sorts where the relative ordering is meaningful: In [186]: values = np.array(['2:first', '2:second', '1:first', '1:second', '1:third']) In [187]: key = np.array([2, 2, 1, 1, 1]) In [188]: indexer = key.argsort(kind='mergesort') In [189]: indexer Out[189]: array([2, 3, 4, 0, 1]) In [190]: values.take(indexer) Out[190]: array(['1:first', '1:second', '1:third', '2:first', '2:second'], dtype='|S8') The only stable sort available is mergesort which has guaranteed O(n log n) performance (for complexity buffs), but its performance is on average worse than the default quicksort method.


pages: 474 words: 91,222

Effective STL: 50 Specific Ways to Improve Your Use of the Standard Template Library by Scott Meyers

functional programming, locality of reference, sorting algorithm, Y2K

If you ask for the 20 best Widgets and some Widgets are equally good, you’re in no position to complain as long as the 20 you get back are at least as good as the ones you didn’t. For a full sort, you have slightly more control. Some sorting algorithms are stable. In a stable sort, if two elements in a range have equivalent values, their relative positions are unchanged after sorting. Hence, if Widget A precedes Widget B in the (unsorted) widgets vector and both have the same quality rating, a stable sorting algorithm will guarantee that after the vector is sorted, Widget A still precedes Widget B. An algorithm that is not stable would not make this guarantee. partial_sort is not stable.

• A predicate is a function that returns bool (or something that can be implicitly converted to bool). Predicates are widely used in the STL. The comparison functions for the standard associative containers are predicates, and predicate functions are commonly passed as parameters to algorithms like find_if and the various sorting algorithms. (For an overview of the sorting algorithms, turn to Item 31.) • A pure function is a function whose return value depends only on its parameters. If f is a pure function and x and y are objects, the return value of f(x, y) can change only if the value of x or y changes. In C++, all data consulted by pure functions are either passed in as parameters or are constant for the life of the function.

J. xvi, 114, 227 pointers allocator typedef for 48 as iterators 120 as return type from vector::begin 75 assignments, avoiding superfluous 31 comparison functions for 88–91 deleting in containers 36–40 dereferencing function object for 91 destructors for 36 invalidation, see iterator invalidation iterators in vector vs. 75 parameters, binary_function and 172 parameters, unary_function and 172 returned from reverse_iterator::base 125 smart, see smart pointers to bitfields 80 to functions, as formal parameters 34 portability #includes and 209–210 casting const iterator to iterators 121 container::<auto_ptr> and 41 explicit template argument specification and 163 hashed containers, code using 112 identity, project1st, project2nd, compose1, compose2, select1st, select2nd and 219 multiple compilers and 3 range construction with istream_iterators and 35 reverse_iterator::base and 125 set/multiset key modification and 98 stateful allocators and 50, 51 STLport STL implementation and 220 stricmp/strcmpi and 154 Potter, John xvi, xvii predicate class, definition of 166 predicates definition of 166 need to be pure functions 166–169 predicting iterator invalidation, in vector/string 68 principle of least astonishment, the 179 priority_queue 138 as part of the STL 2 project1st 219 project2nd 219 proxy objects 49 containers of 82 vector<bool> and 80 ptr_fun, reasons for 173–177 pure function, definition of 166 push_back, back_inserter and 130 push_front, front_inserter and 130 Q qsort 162 declaration for 162 sort vs. 203 queue, as part of the STL 2 R Rabinowitz, Marty xviii random access iterators definition of 5 sorting algorithms requiring 137 range destination, ensuring adequate size 129–133 member functions 25 input iterators and 29 single-element versions vs. 24–33 summary of 31–33 to avoid superfluous pointer assignments 31 sorted, algorithms requiring 146–150 summarizing 156–161 raw_storage_iterator 52 rbegin/rend, relation to begin/end 123 reallocations invalidation of iterators during 59 minimizing via reserve 66–68 rebinding allocators 53 red-black trees 190, 191, 214 redundant computations, avoiding via algorithm calls 182 Reeves, Jack xvi, 227 reference counting disabling, for string 65 multithreading and 64–65 smart pointers 39, 146 see also Boost::shared_ptr string and 64–65 The C++ Standard and 64 this book and 4 references allocator typedef for 48 casting to 98 invalidation, see iterator invalidation parameters, binary_function and 171 parameters, unary_function and 171 to bitfields 80 reference-to-reference problem 222 remove 139–142 see also erase-remove idiom on containers of pointers 143–146 partition vs. 141 remove_copy_if 44 remove_if 44, 144, 145 see also remove possible implementation of 167 replace_if 186 replacing STL implementations 243 reporting bugs in this book xii reserve insert iterators and 131 resize vs. 67 resize reallocation and 67 reserve vs. 67 resource acquisition is initialization 61 resource leaks 36, 39, 63, 144, 145 avoiding via smart pointers 39, 146 preventing via classes 61 result_type 170 return type allocator::allocate vs. operator new 51 for container::begin 75 for distance 122 for erase 32, 117 for function objects for accumulate 158 for insert 17, 32, 117 for vector::operator[] 80 reverse order inserting elements in 184 reverse_iterator base member function 123–125 other iterator types vs. 116–119 rhs, as parameter name 8–9 Rodgers, Mark xv, xvi, xvii rolling back, insertions and erasures 14 rope 218 S Scheme 206 second_argument_type 170 select1st 219 select2nd 219 sentry objects, operator<< and 126 separate heaps, allocators and 56–58 sequence containers see standard sequence containers set constness of elements 95 corrupting via element modification 97 iterator invalidation in, see iterators, invalidation keys, modifying 95–100 set_difference 148 set_intersection 148 set_symmetric_difference 148, 153 set_union 148 sgetc 127 SGI hashed containers implementation 114 iostreams implementation 220 slist implementation 218 STL web site 94, 207, 217–220 thread-safety definition at 58 URL for 217, 227 shared memory, allocators and 55–56 shared_ptr, see Boost::shared_ptr shrink to fit, see swap trick, the side effects accumulate and 160 for_each and 160 in operator++ 45 size vs. capacity 66–67 vs. empty 23–24 size_type 158 sizeof, variations when applied to string 69 skipws 126 slicing problem 21–22, 164 slist 218 small string optimization 71 Smallberg, David xvi, xvii smart pointers see also Boost::shared_ptr avoiding resource leaks with 39, 146 dereferencing function object for 91 implicit conversions and 146 sort 133–138 qsort vs. 203 sorted range algorithms requiring 146–150 sorted vectors associative containers vs. 100–106 legacy APIs and 76 sorting algorithms for 133–138 auto_ptrs 41 consistency and 149 stability, in sorting 135 stable_partition 133–138 stable_sort 133–138 stack, as part of the STL 2 Staley, Abbi xviii standard associative containers see also containers bidirectional iterators and 5 comparison funcs for pointers 88–91 definition of 5 “hint” form of insert 100, 110 iterator invalidation in, see iterators, invalidation key_comp member function 85 search complexity of 6 sorted vector vs. 100–106 typical implementation 52, 190 Standard for C++ see C++ Standard, The standard sequence containers see also containers definition of 5 erase’s return value 46 iterator invalidation in, see iterators, invalidation push_back, back_inserter and 130 push_front, front_inserter and 130 Standard Template Library, see STL Stepanov, Alexander 201 STL algorithms, vs. string member functions 230 arrays and 2 bitset and 2 complexity guarantees in 5–6 container adapters and 2 containers, selecting among 11–15 definition of 2 documentation, on-line 217 early usage problems with xi extending 2 free implementations of 217, 220 function call syntax in 175 implementations compiler implementations vs. 3 Dinkumware bug list for MSVC 244 replacing 243 member templates in 239–240 platform, see STL platform priority_queue and 2 queue and 2 stack and 2 thread safety and 58–62 valarray and 2 web sites about 217–223 wide-character strings and 4 STL platform definition of 3 Microsoft’s, remarks on 239–244 STLport 220–221 debug mode 185, 216 detecting invalidated iterators in 221 hashed containers at 112 URL for 217 strcmp 152, 234 internationalization issues and 150 lexicographical_compare and 153 strcmpi 154 streams, relation to string 230 stricmp 154 strict weak ordering 94 string allocators in 69 alternatives to 65 arrays vs. 63–66 as typedef for basic_string 65 c_str member function 75 comparisons case-insensitive 150–154, 235–237 using locales 229–237 cost of increasing capacity 66 disabling reference counting 65 embedded nulls and 75, 154 growth factor 66 implementation variations 68–73 inheriting from 37 iterator invalidation in, see iterators, invalidation iterators as pointers 120 legacy APIs and 74–77 mem funcs vs. algorithms 230 memory layout for 75 minimum allocation for 72 multithreading and 64–65 reference counting and 64–65 relation to basic_string 4, 210, 229 relation to streams 230 reserve, input iterators and 131 resize vs. reserve 67 shrink to fit 78–79 size vs. capacity 66–67 size_type typedef 158 sizeof, variations in 69 small, optimization for 71 summing lengths of 158 trimming extra capacity from 77–79 vector vs. 64 vector<char> vs. 65 whether reference-counted 65 wstring and 4 string_char_traits 211 strings case-insensitive 229–230 wide-character, see wstring Stroustrup, Bjarne xvi, 68, 226, 228 see also C++ Programming Language, The structs, vs. classes for functor classes 171 summarizing ranges 156–161 Sutter, Herb xvi, xvii, 65, 226, 227, 228 see also Exceptional C++ swap trick, the 77–79 T templates declaring inside functions 188 explicit argument specification for 122, 163, 234 instantiating with local classes 189 member in the STL 239–240 Microsoft’s platforms and 239–244 parameters, declared via class vs. typename 7 temporary objects, created via casts 98 thread safety, in containers 58–62 see also multithreading tolower 151 as inverse of toupper 235 toupper as inverse of tolower 235 cost of calling 236–237 traits classes 113, 211, 230 transactional semantics 14 transform 129, 186 traversal order, in multiset, multimap 87 trees, red-black 190, 191, 214 typedefs allocator::pointer 48 allocator::reference 48 argument_type 170 container::size_type 158 container::value_type 36 first_argument_type 170 for container and iterator types 18–19 mem_fun and 176 mem_fun_ref and 176 ptr_fun and 170 result_type 170 second_argument_type 170 string as 65 wstring as 65 typename class vs. 7 dependent types and 7–8 U unary_function 170–172 pointer parameters and 172 reference parameters and 171 undefined behavior accessing v[0] when v is empty 74 applying some algorithms to ranges of unsorted values 147 associative container comparison funcs yielding true for equal values 92 attempting to modify objects defined to be const 99 changing a set or multiset key 97 deleting an object more than once 64 deleting derived object via ptr-to-base with a nonvirtual destructor 38 detecting via STLport’s debug mode 220–221 modifying components in std 178 multithreading and 59 practical meaning of 3–4 side effects inside accumulate’s function object 160 specifying uninitialized memory as destination range for algorithms 132 using algorithms with inconsistent sort orders 149 using the wrong form of delete 64 when using invalidated iterator 27, 45, 46, 185 underscores, in identifiers 213 uninitialized_fill 52 uniq 148 unique 145, 148, 149 see also remove unique_copy 148, 149 unsigned char, use in <cctype> and <ctype.h> 151 upper_bound 197–198 related functions vs. 192–201 Urbano, Nancy L., see Perfect Woman URLs for Austern’s sample allocator 228 for auto_ptr update page 228 for Boost web site 217, 227 for Dinkumware web site 243 for Effective C++ CD demo 225 for Effective C++ CD errata list 228 for Effective C++ errata list 228 for Effective STL errata list xii for Josuttis’ sample allocator 227 for More Effective C++ errata list 228 for Persephone’s web site 71 for purchasing The C++ Standard 226 for Scott Meyers’ mailing list xiii for Scott Meyers’ web site xiii for SGI STL web site 217, 227 for STLport web site 217 for this book’s errata list xii for this book’s web site xii use_facet 234 cost of calling 234 explicit template argument specification and 234 Usenet newsgroups, see newsgroups V valarray, as part of the STL 2 value_type typedef in containers 36, 108 in iterator_traits 42 vector see also vector<bool>, vector<char> arrays vs. 63–66 contiguous memory for 74 cost of increasing capacity 66 growth factor 66 iterator invalidation in, see iterators, invalidation iterators as pointers 120 legacy APIs and 74–77 reserve, input iterators and 131 resize vs. reserve 67 return type from begin 75 shrink to fit 78–79 size vs. capacity 66–67 sorted legacy APIs and 76 vs. associative containers 100–106 string vs. 64 trimming extra capacity from 77–79 vector::insert, as member template 240 vector<bool> alternatives to 81 legacy APIs and 81 problems with 79–82 proxy objects and 80 vector<char>, vs. string 65 virtual functions, toupper and 236–237 Visual Basic 127 Visual C++, see Microsoft’s STL platforms Vlissides, John 226 see also Design Patterns vocabulary, algorithms as 186 Von, Victor xvii W Wait, John xviii wchar_t 4 web sites see also URLs for STL-related resources 217–223 whitespace, operator<< and 126 wide-character strings, see wstring Widget, use in this book 9 Wizard of Oz, The 83 Woman, Perfect, see Urbano, Nancy L.


pages: 247 words: 43,430

Think Complexity by Allen B. Downey

Benoit Mandelbrot, cellular automata, Conway's Game of Life, Craig Reynolds: boids flock, discrete time, en.wikipedia.org, Frank Gehry, Gini coefficient, Guggenheim Bilbao, Laplace demon, mandelbrot fractal, Occupy movement, Paul Erdős, peer-to-peer, Pierre-Simon Laplace, power law, seminal paper, sorting algorithm, stochastic process, strong AI, Thomas Kuhn: the structure of scientific revolutions, Turing complete, Turing machine, Vilfredo Pareto, We are the 99%

Read the Wikipedia page on sorting algorithms at http://en.wikipedia.org/wiki/Sorting_algorithm, and answer the following questions: What is a “comparison sort”? What is the best worst-case order of growth for a comparison sort? What is the best worst-case order of growth for any sort algorithm? What is the order of growth of bubble sort, and why does Barack Obama think it is “the wrong way to go”? What is the order of growth of radix sort? What preconditions do we need to use it? What is a stable sort, and why might it matter in practice? What is the worst sorting algorithm (that has a name)?

The general solution to this problem is to specify a machine model and analyze the number of steps (or operations) an algorithm requires under a given model. Relative performance might depend on the details of the dataset. For example, some sorting algorithms run faster if the data are already partially sorted; other algorithms run slower in this case. A common way to avoid this problem is to analyze the worst case scenario. It is also sometimes useful to analyze average case performance, but it is usually harder, and sometimes it is not clear what set of cases to average. Relative performance also depends on the size of the problem. A sorting algorithm that is fast for small lists might be slow for long lists. The usual solution to this problem is to express runtime (or number of operations) as a function of problem size and to compare the functions asymptotically as the problem size increases.

What is the worst sorting algorithm (that has a name)? What sort algorithm does the C library use? What sort algorithm does Python use? Are these algorithms stable? (You might have to Google around to find these answers.) Many of the non-comparison sorts are linear, so why does Python use an comparison sort? Analysis of Search Algorithms A search is an algorithm that takes a collection and a target item and determines whether the target is in the collection, often returning the index of the target. The simplest search algorithm is a “linear search,” which traverses the items of the collection in order, stopping if it finds the target.


pages: 554 words: 108,035

Scala in Depth by Tom Kleenex, Joshua Suereth

discrete time, domain-specific language, duck typing, fault tolerance, functional programming, higher-order functions, MVC pattern, sorting algorithm, type inference

Conditional execution using the type system There comes a time in an algorithm’s life when it needs to do something rather clever. This clever behavior encodes portions of the algorithm into the type system so that it can execute at compile time. An example of this could be a sort algorithm. The sort algorithm can be written against the raw Iterator interface. But if I call sort against a vector, then I’d like to be able to utilize vector’s natural array separation in my sorting algorithm. Traditionally this has been solved with two mechanisms: overloading and overriding. Using overloading, the sort method is implemented in terms of Iterable and another is implemented in terms of Vector.

But when developing new algorithms outside the collections library, type classes can come to the rescue. 8.5.1. Optimizing algorithms for each collections type You can use the type class paradigm to encode an algorithm against collections and refine that algorithm when speed improvements are possible. Let’s start by converting the generic sort algorithm from before into a type class paradigm. First we’ll define the type class for the sort algorithm. trait Sortable[A] { def sort(a : A) : A } The Sortable type class is defined against the type parameter A. The type parameter A is meant to be the full type of a collection. For example, sorting a list of integers would require a Sortable[List[Int]] object.

If more than one placeholder is used, each consecutive placeholder refers to consecutive arguments to the function literal. This notation is usually reserved for simple functions, such as the less-than (<) comparison in our Quicksort. We can apply this notation paired with operator notation to achieve the following on our quick sort algorithm: Scala offers syntactic shortcuts for simple cases, and it provides a mechanism to bend the type system via implicits conversions and implicits arguments. 1.2.4. Implicits are an old concept Scala implicits are a new take on an old concept. The first time I was ever introduced to the concept of implicit conversions was with primitive types in C++.


pages: 523 words: 143,139

Algorithms to Live By: The Computer Science of Human Decisions by Brian Christian, Tom Griffiths

4chan, Ada Lovelace, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, Albert Einstein, algorithmic bias, algorithmic trading, anthropic principle, asset allocation, autonomous vehicles, Bayesian statistics, behavioural economics, Berlin Wall, Big Tech, Bill Duvall, bitcoin, Boeing 747, Charles Babbage, cognitive load, Community Supported Agriculture, complexity theory, constrained optimization, cosmological principle, cryptocurrency, Danny Hillis, data science, David Heinemeier Hansson, David Sedaris, delayed gratification, dematerialisation, diversification, Donald Knuth, Donald Shoup, double helix, Dutch auction, Elon Musk, exponential backoff, fault tolerance, Fellow of the Royal Society, Firefox, first-price auction, Flash crash, Frederick Winslow Taylor, fulfillment center, Garrett Hardin, Geoffrey Hinton, George Akerlof, global supply chain, Google Chrome, heat death of the universe, Henri Poincaré, information retrieval, Internet Archive, Jeff Bezos, Johannes Kepler, John Nash: game theory, John von Neumann, Kickstarter, knapsack problem, Lao Tzu, Leonard Kleinrock, level 1 cache, linear programming, martingale, multi-armed bandit, Nash equilibrium, natural language processing, NP-complete, P = NP, packet switching, Pierre-Simon Laplace, power law, prediction markets, race to the bottom, RAND corporation, RFC: Request For Comment, Robert X Cringely, Sam Altman, scientific management, sealed-bid auction, second-price auction, self-driving car, Silicon Valley, Skype, sorting algorithm, spectrum auction, Stanford marshmallow experiment, Steve Jobs, stochastic process, Thomas Bayes, Thomas Malthus, Tragedy of the Commons, traveling salesman, Turing machine, urban planning, Vickrey auction, Vilfredo Pareto, Walter Mischel, Y Combinator, zero-sum game

For a broader overview of Flack’s work, see Flack, “Life’s Information Hierarchy.” This sporting contest is the marathon: The marathon has an analogue in the world of sorting algorithms. One of the more intriguing (Wikipedia used the word “esoteric” before the article was removed entirely) developments in beyond-comparison sorting theory arose from one of the most unlikely places: the notorious Internet message board 4chan. In early 2011, an anonymous post there proclaimed: “Man, am I a genius. Check out this sorting algorithm I just invented.” The poster’s “sorting algorithm”—Sleep Sort—creates a processing thread for each unsorted item, telling each thread to “sleep” the number of seconds of its value, and then “wake up” and output itself.

Each pass through the cards doubles the size of the sorted stacks, so to completely sort n cards you’ll need to make as many passes as it takes for the number 2, multiplied by itself, to equal n: the base-two logarithm, in other words. You can sort up to four cards in two collation passes, up to eight cards with a third pass, and up to sixteen cards with a fourth. Mergesort’s divide-and-conquer approach inspired a host of other linearithmic sorting algorithms that quickly followed on its heels. And to say that linearithmic complexity is an improvement on quadratic complexity is a titanic understatement. In the case of sorting, say, a census-level number of items, it’s the difference between making twenty-nine passes through your data set … and three hundred million.

Jordan aims to get a group of 25 or so books onto his cart before putting them in final order, which he does using an Insertion Sort. And his carefully developed strategy is exactly the right way to get there: a Bucket Sort, with his well-informed forecast of how many books he’ll have with various call numbers telling him what his buckets should be. Sort Is Prophylaxis for Search Knowing all these sorting algorithms should come in handy next time you decide to alphabetize your bookshelf. Like President Obama, you’ll know not to use Bubble Sort. Instead, a good strategy—ratified by human and machine librarians alike—is to Bucket Sort until you get down to small enough piles that Insertion Sort is reasonable, or to have a Mergesort pizza party.


pages: 1,829 words: 135,521

Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython by Wes McKinney

Bear Stearns, business process, data science, Debian, duck typing, Firefox, general-purpose programming language, Google Chrome, Guido van Rossum, index card, p-value, quantitative trading / quantitative finance, random walk, recommendation engine, sentiment analysis, side project, sorting algorithm, statistical model, Two Sigma, type inference

Suppose we wanted to sort some data identified by first and last names: In [184]: first_name = np.array(['Bob', 'Jane', 'Steve', 'Bill', 'Barbara']) In [185]: last_name = np.array(['Jones', 'Arnold', 'Arnold', 'Jones', 'Walters']) In [186]: sorter = np.lexsort((first_name, last_name)) In [187]: sorter Out[187]: array([1, 2, 3, 0, 4]) In [188]: zip(last_name[sorter], first_name[sorter]) Out[188]: <zip at 0x7fa203eda1c8> lexsort can be a bit confusing the first time you use it because the order in which the keys are used to order the data starts with the last array passed. Here, last_name was used before first_name. Note pandas methods like Series’s and DataFrame’s sort_values method are implemented with variants of these functions (which also must take into account missing values). Alternative Sort Algorithms A stable sorting algorithm preserves the relative position of equal elements. This can be especially important in indirect sorts where the relative ordering is meaningful: In [189]: values = np.array(['2:first', '2:second', '1:first', '1:second', .....: '1:third']) In [190]: key = np.array([2, 2, 1, 1, 1]) In [191]: indexer = key.argsort(kind='mergesort') In [192]: indexer Out[192]: array([2, 3, 4, 0, 1]) In [193]: values.take(indexer) Out[193]: array(['1:first', '1:second', '1:third', '2:first', '2:second'], dtype='<U8') The only stable sort available is mergesort, which has guaranteed O(n log n) performance (for complexity buffs), but its performance is on average worse than the default quicksort method.

Language Semantics The Python language design is distinguished by its emphasis on readability, simplicity, and explicitness. Some people go so far as to liken it to “executable pseudocode.” Indentation, not braces Python uses whitespace (tabs or spaces) to structure code instead of using braces as in many other languages like R, C++, Java, and Perl. Consider a for loop from a sorting algorithm: for x in array: if x < pivot: less.append(x) else: greater.append(x) A colon denotes the start of an indented code block after which all of the code must be indented by the same amount until the end of the block. Love it or hate it, significant whitespace is a fact of life for Python programmers, and in my experience it can make Python code more readable than other languages I’ve used.

global keyword, Namespaces, Scope, and Local Functions glue for code, Python as, Python as Glue greater function, Universal Functions: Fast Element-Wise Array Functions greater_equal function, Universal Functions: Fast Element-Wise Array Functions Greenwich Mean Time, Time Zone Handling group keys, suppressing, Suppressing the Group Keys group operationsabout, Data Aggregation and Group Operations, Advanced GroupBy Use cross-tabulation, Cross-Tabulations: Crosstab data aggregation, Data Aggregation-Returning Aggregated Data Without Row Indexes GroupBy mechanics, GroupBy Mechanics-Grouping by Index Levels pivot tables, Data Aggregation and Group Operations, Pivot Tables and Cross-Tabulation-Cross-Tabulations: Crosstab split-apply-combine, GroupBy Mechanics, Apply: General split-apply-combine-Example: Group-Wise Linear Regression unwrapped, Group Transforms and “Unwrapped” GroupBys group weighted average, Example: Group Weighted Average and Correlation groupby function, itertools module groupby method, Computations with Categoricals, numpy.searchsorted: Finding Elements in a Sorted Array GroupBy objectabout, GroupBy Mechanics-GroupBy Mechanics grouping by index level, Grouping by Index Levels grouping with dicts, Grouping with Dicts and Series grouping with functions, Grouping with Functions grouping with Series, Grouping with Dicts and Series iterating over groups, Iterating Over Groups optimized methods, Data Aggregation selecting columns, Selecting a Column or Subset of Columns selecting subset of columns, Selecting a Column or Subset of Columns groups method, Regular Expressions H %H datetime format, Dates and times, Converting Between String and Datetime h(elp) debugger command, Interactive Debugger hasattr function, Attributes and methods hash function, Valid dict key types hash maps (see dicts) hash mark (#), Comments hashability, Valid dict key types HDF5 (hierarchical data format 5), Using HDF5 Format-Using HDF5 Format, HDF5 and Other Array Storage Options HDFStore class, Using HDF5 Format head method, DataFrame heapsort method, Alternative Sort Algorithms hierarchical data format (HDF5), HDF5 and Other Array Storage Options hierarchical indexingabout, Hierarchical Indexing-Hierarchical Indexing in pandas, Reading and Writing Data in Text Format reordering and sorting levels, Reordering and Sorting Levels reshaping data with, Reshaping with Hierarchical Indexing summary statistics by level, Summary Statistics by Level with DataFrame columns, Indexing with a DataFrame’s columns %hist magic function, About Magic Commands hist method, Histograms and Density Plots histograms, Histograms and Density Plots-Histograms and Density Plots hsplit function, Concatenating and Splitting Arrays hstack function, Concatenating and Splitting Arrays HTML files, XML and HTML: Web Scraping-Parsing XML with lxml.objectify HTTP requests, Interacting with Web APIs Hugunin, Jim, NumPy Basics: Arrays and Vectorized Computation Hunter, John D., matplotlib, Plotting and Visualization I %I datetime format, Dates and times, Converting Between String and Datetime identity function, Creating ndarrays IDEs (Integrated Development Environments), Integrated Development Environments (IDEs) and Text Editors idxmax method, Summarizing and Computing Descriptive Statistics idxmin method, Summarizing and Computing Descriptive Statistics if statement, if, elif, and else iloc operator, Selection with loc and iloc, Permutation and Random Sampling immutable objects, Mutable and immutable objects, Categorical Type in pandas import conventionsfor matplotlib, A Brief matplotlib API Primer for modules, Import Conventions, Imports for Python, Import Conventions, Imports, The NumPy ndarray: A Multidimensional Array Object importlib module, Reloading Module Dependencies imshow function, Array-Oriented Programming with Arrays in keyword, Adding and removing elements, String Object Methods in-place sorts, Sorting, More About Sorting in1d method, Unique and Other Set Logic, Unique and Other Set Logic indentation in Python, Indentation, not braces index method, String Object Methods-String Object Methods, Pivot Tables and Cross-Tabulation Index objects, Index Objects-Index Objects indexes and indexingaxis indexes with duplicate labels, Axis Indexes with Duplicate Labels boolean indexing, Boolean Indexing-Boolean Indexing fancy indexing, Fancy Indexing, Fancy Indexing Equivalents: take and put for ndarrays, Basic Indexing and Slicing-Indexing with slices for pandas library, Indexing, Selection, and Filtering-Selection with loc and iloc, Axis Indexes with Duplicate Labels grouping by index level, Grouping by Index Levels hierarchical indexing, Reading and Writing Data in Text Format, Hierarchical Indexing-Indexing with a DataFrame’s columns, Reshaping with Hierarchical Indexing Index objects, Index Objects-Index Objects integer indexing, Integer Indexes merging on index, Merging on Index-Merging on Index renaming axis indexes, Renaming Axis Indexes time series data, Indexing, Selection, Subsetting time series with duplicate indexes, Time Series with Duplicate Indices timedeltas and, Time Series indexing operator, Slicing indicator variables, Computing Indicator/Dummy Variables-Computing Indicator/Dummy Variables indirect sorts, Indirect Sorts: argsort and lexsort inner join type, Database-Style DataFrame Joins input variables, Input and Output Variables insert method, Adding and removing elements, Index Objects insort function, Binary search and maintaining a sorted list int data type, Scalar Types, Type casting int function, Type casting int16 data type, Data Types for ndarrays int32 data type, Data Types for ndarrays int64 data type, Data Types for ndarrays int8 data type, Data Types for ndarrays integer arrays, indexing, Fancy Indexing, Fancy Indexing Equivalents: take and put integer indexing, Integer Indexes Integrated Development Environments (IDEs), Integrated Development Environments (IDEs) and Text Editors interactive debugger, Interactive Debugger-Other ways to make use of the debugger interpreted languages, Why Python for Data Analysis?


pages: 982 words: 221,145

Ajax: The Definitive Guide by Anthony T. Holdener

AltaVista, Amazon Web Services, business logic, business process, centre right, Citizen Lab, Colossal Cave Adventure, create, read, update, delete, database schema, David Heinemeier Hansson, en.wikipedia.org, Firefox, full text search, game design, general-purpose programming language, Guido van Rossum, information retrieval, loose coupling, machine readable, MVC pattern, Necker cube, p-value, Ruby on Rails, SimCity, slashdot, social bookmarking, sorting algorithm, SQL injection, Wayback Machine, web application

For example: /* This code is necessary for browsers that do not define DOM constants */ if (document.ELEMENT_NODE == null) { document.ELEMENT_NODE = 1; document.ATTRIBUTE_NODE = 2; document.TEXT_NODE = 3; document.CDATA_SECTION_NODE = 4; document.ENTITY_REFERENCE_NODE = 5; document.ENTITY_NODE = 6; document.PROCESSING_INSTRUCTION_NODE = 7; document.COMMENT_NODE = 8; document.DOCUMENT_NODE = 9; document.DOCUMENT_TYPE_NODE = 10; document.DOCUMENT_FRAGMENT_NODE = 11; document.NOTATION_NODE = 12; } Now, we need to decide what kind of sorting algorithm we should use on the table. I’ll just skip most of the computer science involved in determining a good algorithm, and instead will make two simple comments. One, the quick sort is the fastest search algorithm that is relatively straightforward to implement. Two, for the Web, the insertion sort is the best choice of algorithms. Why should we not use a quick sort for our sort algorithm if it is the fastest? The easiest answer for me to give is that generally, the data displayed in an Ajax application is not going to have hundreds or thousands of records that will need to be sorted.

. * @type Array */ function insertionSort(dataArray) { var j, index; /* Loop through the array to sort each value */ for (var i = 0, il = dataArray.length; i < il; i++) { index = dataArray[i]; j = i; /* Move the /dataArray/ index to the place of insertion */ while ((j > 0) && (dataArray[j - 1] > index)) { dataArray[j] = dataArray[j - 1]; j -= 1; } /* Move the current /dataArray/ index to the insertion location */ dataArray[j] = index; } return (dataArray); } Sorting Tables | 267 Sorting Algorithms The most common sorting algorithms can be separated into two groups based on their complexity. These two groups are represented by the Big O notations O(n2) and O(n log n). The computer science of Big O notation is interesting, but well beyond the scope of this book. You can find more information on the subject at http://en.wikipedia.org/wiki/ Big_O_notation.

., 14 guided tours and wizards, 148 Gutmans, Andi, 41 GwebResult objects, 586 CSS structure, 592 example of use, 589 properties, 588 GWebSearch object, 586 H handle to the image (animation object), 442 Hansson, David Heinemeier, 59 hasAttribute( ) method, 113 hasChildNodes( ) method, 113 head component of a web page, 791–792 header elements, ordering of, 292 heap sorts, 268 hierarchical databases, 47 High Performance JavaScript Vector Graphics Library, 472 hints to the user on search topics, 577–580 search submitted from hints, 580 Holovaty, Adrian, 61 HousingMaps.com, 660 HTML, 4 frames, deprecated, 321 HTML 5 draft specification, 12 produced by ASP.NET, 40 reformulation of HTML 4.01 as XML, 10 REST and, 605 standards supported by browser engines, 18 Version 4.01 Frameset DTD, 316 XHTML form elements versus, 484 (see also XHTML) html object (dojo.lfx.html), slideBy( ) and slideTo( ) methods, 465 HtmlDragSource object, 459 HtmlDragSource( ) method, 458 HtmlDropTarget object, 459 HTMLEvent module, 129 HtmlWidget object, 515 HTTP, 4, 809–815 compression, 813–815 errors and error messags, 421 938 | Index headers, 810–813 Expires header, 813 general and entity headery, 810 removing/modifying with Apache module, 812 response headers, 810 typical header from an Apache web server, 811 passing parameters to HTTP methods, 865 REST and, 605 servers, 36 status codes, 612 I IBM DB2, 45 open source version, Express-C, 46 icons images and symbols used for, 167 indicating Ajax actions, 240 id attribute <iframe> element, 322 accessing form elements by, 494 using for quick customizations, 398 utilizing its flexibility in form elements, 494 XHTML 1.0, 321 IETF (Internet Engineering Task Force), 10 Atom Syndication Format protocol (RFC 4287), 16 <iframe> elements, 320 differences from <frame> element, 322 id attribute (XHTML 1.0), 321 rendering bug fix for Internet Explorer, 347 replacing using <div> element and Ajax, 323–329 using for asynchronous file transfer, 529 using instead of <frame>, 322 IIS (Internet Information Services), 36 customizations with ISAPI filters, 812 features, 37 HTTP compression, 814 image descriptor (GIF), 435 image rollovers, 185–188 CSS for browsers without scripting, 186 JavaScript code for, 185 JavaScript turned off in the browser, 186 images in form controls, 487 indicating Ajax actions, 240 tab with multiple states in one image, 216–217 web application, 167 <img> elements, 321 implementation phase Ajax web application development, 26 software development life cycle, 23 importNode( ) function cross-browser version registering events and styles, 327–329 version for Internet Explorer, 224, 231 incrementing and decrementing operators, 829 indexes database, 836 form index, 494 page, string and database indexing, 570 indexing, page, 569 information boxes, 348 informational methods (DOM), 113 informational properties (DOM), 114 initialization methods (event), 130 initializePageStateFromURL( ) function, 245 inline documentation, 27 inline frames (see <iframe> elements) innerHTML property, 138–140 <div> element acting as table wrapper, 262 placing content into windows, 347 setting responseText to, 277 applying data to DOM document tree, 260 documented problems, 140 DOM methods versus, 259 issue with <tbody> element in Internet Explorer, 261 using to put data into a table, 258 in-place editing, 157 <input> elements adding alternative text, 488 checkbox and radio, using flexibility of id attribute, 494 images, 487 onblur( ) method, 558 placement of labels, 485 submit type, Ajax forms and, 520 tabindex attribute, 488 text, focus( ), blur( ), and select( ) methods, 493 user search input, 577 INSERT statement (SQL), 51 insertBefore( ) method, 109 insertion sorts, 268 simple implementation (example), 266 table sort (example), 270–275 insertRow( ) and insertCell( ) methods, 258 creating table rows and columns, 259 installation of web applications, time and cost savings, 674 instant messaging (IM), 691 interaction on a web page in the year 2000, 7 changes brought about by Ajax, 9 interfaces, designing, 141–172 accessibility, 167–171 Ajax interface, 171 functionality, 153–158 common web tools, 153–155 determining tools needed, 157 tools in desktop applications, 156 types of functions in applications, 153 principles for Ajax web applications consistency, 150 documentation and help, 152 feedback, 152 flexibility and efficiency, 150 navigation, 151 usability, 141–153 common design mistakes, 142–147 principles for Ajax web applications, 148–153 visualization, 158–167 application layout, 158–162 images and icons, 167 Internet Engineering Task Force (see IETF) Internet errors, 413 Internet Explorer :hover support on elements, 182 alpha transparency support, 439 alternatives to DOM stylesheet methods and properties, 127 CSS drop-down menus and, 191 CSS for image rollover, 187 CSS hack for breadcrumbs, 223 data URL problem workaround, 314 events, 133 font resizing issues, 386 :hover workaround, 188 <iframe> element, 320 importNode( ) function for, 224, 231 importNode( ) method, event attributes and, 329 innerHTML property, 138–140, 259 Index | 939 Internet Explorer (continued) innerHTML/<tbody> issue, 261 workaround, 262 JavaScript error properties, 409 rendering bug, draggable box and, 345 Trident layout engine, 18 user changes, 363 W3C DOM stylesheets Recommendation and, 117 web site information on stylesheet handling, 129 XPath, 83 Internet Information Services (see IIS) Internet protocols, 816 Internet relay chat (IRC), 691 interpreted languages, 41 intranets, use of Ajax web applications, 31 ISAPI filters, 812 httpZip, 815 ISBNdb, 650, 901 isset( ) method, 556 J Jakarta Struts framework, 60 Java, 43 applets, 734 frameworks, 60 inline documentation, 27 Java Remote Method Invocation (Java RMI), 596 JRE (Java Runtime Environment), 734 servlets, 38 Java Virtual Machines (JVMs), 44 Javadoc parser, 27 JavaScript, xiii, 5 Ajax dependence on, 916 data validation, 536–552 specialized data checking, 539–543 using Dojo Toolkit, 549–552 using regular expressions, 538 validation object, 543–549 value checking, 537 draggable <div> element, 455 dynamically creating list to be used as table of contents, 293–295 errors, 409 execution speed, 809 frameworks, toolkits, and libraries, 863–891 Gecko 1.9 implementaton of, 19 giving search hints to the user, 577–579 940 | Index image rollovers, 185 inline documentation, 27 jQuery library, 96 looping (animation object), 442–444 manipulating forms, 490–497 getting and setting element values, 493 getting form values, 490–493 modular client-side coding, 802 breaking code apart by functionality, 802 page-specific components, 803 moo.fx library, 96 optimizations, 822–830 Boolean variables, 823 code speed enhancements, 826–830 removing comments, 822 removing whitespace, 822 testing validity of a value, 823 using array and object literals, 824 produced by ASP.NET, 40 properties equivalent to CSS2 properties, 119–124 script.aculo.us library, 95, 238 slide show application, 307–311 sortable list initialization, 300 sorting, 264–275 quick sort algorithm, 269 shell sort algorithm, 268 specifications, 12 support by browser engines, 18 table pagination, 285–287 var keyword, 743 Vector Graphics Library, 472–481 Ajax application, 476–481 methods available to, 474–476 version 1.5, 12 version 1.7, 12 JavaScript Object Notation (see JSON) JavaServer Pages (JSP), 38, 44 Johnson, Rod, 60 joins (database tables), 52 JPEG image format, 437 jQuery library, 96, 889–891 functions for Ajax request functionality, 889 JRE (Java Runtime Environment), 734 JSDoc parser, 27 jsGraphics library, 704 drawLine( ) and paint( ) methods, 706 methods for drawing vector shapes, 719 setColor( ) method, 718 JSON (JavaScript Object Notation), 68, 86 array and object literals, 826 evaluating in Prototype, 866 Flickr response, 642 handing in Dojo Toolkit, 882 parsing JSON strings, 90 requests and responses, 87–90 response from Flickr, handling, 649 response from server with whiteboard points, 713 response to language change request for a form, 401 response when file is being sent, 697 script handling RAW POST as, 526 search query response, 590 web site, 87 XML versus, 92–94 JSP (JavaServer Pages), 38, 44 JVMs (Java Virtual Machines), 44 K Kaplan-Moss, Jacob, 61 KDE project, XHTML/WebCore browser engine, 18 keyboard input, 765 keypress event, 765 keys (database), 52 keyword searches, 566 through mobile technology (411Sync), 656 KHTML/WebCore, 18 standards supported, 18 L <label> elements array for changes in language-switching process, 399 placement for form elements, 485–488 switching data to JSON response, 402 LAMP (Linux, Apache, MySQL, PHP [Perl/Python]), 42 languages changing site language with Ajax, 400–403 programming (see programming languages) switching forms to different languages, 399–400 Last.fm, 632, 901 lastColumn property, 276 JavaScript insertion sort, 271 layers, 804 layout application, 158–162 balance, 159 consistency in, 161 density, 160 general layout and navigation, 246 images, 167 linear layout on classic web sites, 158 modern example of linear approach, 160 organic layout, 158 rule of thirds (or golden ratio), 163 using CSS, 250–252 using tables, 247–250 page, 329–334 dynamic nature of pages, 330 separating structure from presentation, 333–334 site navigation, 175–246 menus, 175–212 layout engines (browser), 17 <legend> elements, 489 Lerdorf, Rasmus, 41 libraries, 453 book information database (ISBNdb), 650 jQuery, 889–891 MochiKit, 886–889 MooTools, 877–880 Rico, 875 Sarissa, 884–885 licenses, professional, 667 linear collision detection, 762–764 linear layout classic web sites, 158 modern approach used in CSS Zen Garden, 160 line-of-sight path movement, 748 links alternate stylesheet, 364 bottom-of-a-page navigation aids, 226–228 color theme switches, 397 main and alternate stylesheet links, 371 testing for broken links, 26 Linux, alert windows, 335 Listamatic (web site), 223 Index | 941 lists, 291–315 slide show application, 304–315 CSS styling rules, 305–307 JavaScript code, 307–311 server-side PHP script to send images, 312–314 working slide show using Ajax, 311 sortable, 297–302 Ajax and draggable lists, 302–304 CSS styling rules, 300 Dojo Toolkit, using, 301–302 JavaScript to initialize and make functional, 300 table of contents, 292–297 CSS rules for numbering system, 296 dynamically created with JavaScript, 293–295 trees of lists used in navigation boxes, 231–235 Zapatec DHTML Tree, 234 use of XHTML lists in menus, 192 used for navigation functionality, 291 using XHTML lists in breadcrumbs, 222 using XHTML lists in links at bottom of a page, 226 using XHTML lists in tabs, 213 vertical lists in navigation boxes, 235 load testing, 26 load( ) method, 79 loadMenu( ) function, 211 loadXML( ) method (Document), 79, 81 local variables, 828 logging errors, 420 logError.php script, 427 PHP constants, 410 Logic class circular collision testing, 760 linear collision detection, 762 rectangular collision testing, 755 logic.js file (game logic), 776–778 login (chat application example), 676–678 JavaScript to check username validity, 676 login.php file, 678 logout.php file (chat application), 686 look of an application (see visualization) lookup table (LUT) of square root values, 760 Luhn Formula, 541, 543 Luhn, Hans Peter, 543 942 | Index M Ma.gnolia (bookmark service), 624, 902 Mac OS X, alert windows, 335 Macromedia Flash, 733 Shockwave, 734 magnifiers, 157 mail client (Gmail), use of Ajax, 172 maintenance (software development), 23 manipulation methods (DOM), 110–112 <map> elements, 321 Map24 AJAX, 628, 902 mapping services, 627–631 list of popular mapping services, 627 mashups and, 661 business opportunities, 671 Yahoo!


pages: 509 words: 92,141

The Pragmatic Programmer by Andrew Hunt, Dave Thomas

A Pattern Language, Broken windows theory, business logic, business process, buy low sell high, c2.com, combinatorial explosion, continuous integration, database schema, domain-specific language, don't repeat yourself, Donald Knuth, Ford Model T, Free Software Foundation, general-purpose programming language, George Santayana, Grace Hopper, higher-order functions, if you see hoof prints, think horses—not zebras, index card, Kaizen: continuous improvement, lateral thinking, loose coupling, Menlo Park, MVC pattern, off-by-one error, premature optimization, Ralph Waldo Emerson, revision control, Schrödinger's Cat, slashdot, sorting algorithm, speech recognition, systems thinking, the Cathedral and the Bazaar, traveling salesman, urban decay, Y2K

Examples include exhaustive searches, finding the maximum value in an array, and generating checksums. Nested loops. If you nest a loop inside another, then your algorithm becomes O(m x n), where m and n are the two loops' limits. This commonly occurs in simple sorting algorithms, such as bubble sort, where the outer loop scans each element in the array in turn, and the inner loop works out where to place that element in the sorted result. Such sorting algorithms tend to be O(n2). Binary chop. If your algorithm halves the set of things it considers each time around the loop, then it is likely to be logarithmic, O(lg(n)) (see Exercise 37, page 183). A binary search of a sorted list, traversing a binary tree, and finding the first set bit in a machine word can all be O(lg(n)).

In most C and C++ implementations, you'll find some form of assert or _assert macro that checks a Boolean condition. These macros can be invaluable. If a pointer passed in to your procedure should never be NULL, then check for it: void writeString(char *string) { assert(string != NULL); ... Assertions are also useful checks on an algorithm's operation. Maybe you've written a clever sort algorithm. Check that it works: for (int i = 0; i < num_entries-1; i++) { assert(sorted[i] <= sorted[i+1]); } Of course, the condition passed to an assertion should not have a side effect (see the box on page 124). Also remember that assertions may be turned off at compile time—never put code that must be executed into an assert.

If you test a sort routine with random input keys, you may be surprised the first time it encounters ordered input. Pragmatic Programmers try to cover both the theoretical and practical bases. After all this estimating, the only timing that counts is the speed of your code, running in the production environment, with real data.[2] This leads to our next tip. [2] In fact, while testing the sort algorithms used as an exercise for this section on a 64MB Pentium, the authors ran out of real memory while running the radix sort with more than seven million numbers. The sort started using swap space, and times degraded dramatically. Tip 46 Test Your Estimates If it's tricky getting accurate timings, use code profilers to count the number of times the different steps in your algorithm get executed, and plot these figures against the size of the input.


The Ethical Algorithm: The Science of Socially Aware Algorithm Design by Michael Kearns, Aaron Roth

23andMe, affirmative action, algorithmic bias, algorithmic trading, Alignment Problem, Alvin Roth, backpropagation, Bayesian statistics, bitcoin, cloud computing, computer vision, crowdsourcing, data science, deep learning, DeepMind, Dr. Strangelove, Edward Snowden, Elon Musk, fake news, Filter Bubble, general-purpose programming language, Geoffrey Hinton, Google Chrome, ImageNet competition, Lyft, medical residency, Nash equilibrium, Netflix Prize, p-value, Pareto efficiency, performance metric, personalized medicine, pre–internet, profit motive, quantitative trading / quantitative finance, RAND corporation, recommendation engine, replication crisis, ride hailing / ride sharing, Robert Bork, Ronald Coase, self-driving car, short selling, sorting algorithm, sparse data, speech recognition, statistical model, Stephen Hawking, superintelligent machines, TED Talk, telemarketer, Turing machine, two-sided market, Vilfredo Pareto

As the previous paragraph suggests, computer science has traditionally focused on algorithmic trade-offs related to what we might consider performance metrics, including computational speed, the amount of memory required, or the amount of communication required between algorithms running on separate computers. But this book, and the emerging research it describes, is about an entirely new dimension in algorithm design: the explicit consideration of social values such as privacy and fairness. Man Versus Machine (Learning) Algorithms such as the sorting algorithm we describe above are typically coded by the scientists and engineers who design them: every step of the procedure is explicitly specified by its human designers, and written down in a general-purpose programming language such as Python or C++. But not all algorithms are like this. More complicated algorithms—the type that we categorize as machine learning algorithms—are automatically derived from data.

To describe what differential privacy asks for, it is important to first understand what a randomized algorithm is. Let’s start with an algorithm. Remember, this is just a precise description of how to take inputs to some problem and process them to arrive at the desired outputs. In the introduction, we saw an example of a sorting algorithm. But an algorithm might also take as input health records and map them to a set of features that appear to be correlated. Or it might take as input video rental records and map them to a set of movie recommendations for each customer. The important thing is that the algorithm precisely specifies the process by which the inputs are mapped to the outputs.

The problem with computing machines is not that they won’t do what they are programmed to, but rather that they will do exactly what they are programmed to. This is a problem because it can be hard to anticipate exactly what a computer is programmed to do in a particular situation. It is true that for comparatively simple programs—a word processor, for example, or a sorting algorithm—the programmer will precisely specify what the program should do in every situation. But as we have discussed elsewhere, this is not how machine learning works. Consider the seemingly easy task of distinguishing pictures of cats from pictures of dogs. It is too difficult for anyone to enunciate precisely how to perform this task, even though three-year-olds can solve the problem easily and reliably.


Pearls of Functional Algorithm Design by Richard Bird

bioinformatics, data science, functional programming, Kickstarter, Menlo Park, sorting algorithm

Each test of f (x , y) against z has three possible outcomes, so the height h of the ternary tree of tests has to satisfy h ≥ log3 A(m, n). Provided we can estimate A(m, n) this gives us a lower bound on the number of tests that have to be performed. The situation is the same with sorting n items by binary comparisons; there are n! possible outcomes, so any sorting algorithm has to make at least log2 n! comparisons in the worst case. Improving on saddleback search 17 It is easy to estimate A(m, n): each list of pairs (x , y) in the range 0 ≤ x < n and 0 ≤ y < m with f (x , y) = z is in a one-to-one correspondence with a step shape from the top-left corner of the m × n rectangle to the bottom-right corner, in which the value z appears at the inner corners of the steps.

i ) (12.8) In words, one can partition a list of pairs by first partitioning with respect to first components and refining the result using the second components. The correct second components can be installed in each run because each run is a list of positions. After installation, each run is partition sorted and the results concatenated. To be accurate, (12.7) holds only if psort is a stable sorting algorithm and the implementation in Figure 12.1 is not stable. If psort is not stable then the elements in each run will appear in a different order in the leftand right-hand sides. But (12.7) does hold if we interpret equality of two partitions to mean equal up to some permutation of the elements in each run.

The third identity, the key one, is not so obvious: hdsort · map rrot · sort · rots = sort · rots (13.5) In words, the following transformation on a matrix of sorted rotations is the identity: move the last column to the front and then resort the rows on the new first column. In fact, (13.5) is true only if hdsort is a stable sorting algorithm, meaning that columns with the same first element appear in the output in the same order that they appeared in the input. Under this assumption we have, applied to an n × n matrix, that sort = (hdsort · map rrot)n This identity states that one can sort an n × n matrix (in fact, an arbitrary list of lists all of which have length n) by repeating n times the operation of rotating the last column into first position and then stably sorting according to the first column only.


pages: 238 words: 93,680

The C Programming Language by Brian W. Kernighan, Dennis M. Ritchie

Dennis Ritchie, general-purpose programming language, Ken Thompson, machine readable, sorting algorithm

We will illustrate this by modifying the sorting procedure written earlier in this chapter so that if the optional argument -n is given, it will sort the input lines numerically instead of lexicographically. A sort often consists of three parts - a comparison that determines the ordering of any pair of objects, an exchange that reverses their order, and a sorting algorithm that makes comparisons and exchanges until the objects are in order. The sorting algorithm is independent of the comparison and exchange operations, so by passing different comparison and exchange functions to it, we can arrange to sort by different criteria. This is the approach taken in our new sort. Lexicographic comparison of two lines is done by strcmp, as before; we will also need a routine numcmp that compares two lines on the basis of numeric value and returns the same kind of condition indication as strcmp does.

-1 : 1; if (s[i] == '+' || s[i] == '-') i++; for (n = 0; isdigit(s[i]); i++) n = 10 * n + (s[i] - '0'); return sign * n; /* skip white space */ /* skip sign */ } The standard library provides a more elaborate function strtol for conversion of strings to long integers; see Section 5 of Appendix B. The advantages of keeping loop control centralized are even more obvious when there are several nested loops. The following function is a Shell sort for sorting an array of integers. The basic idea of this sorting algorithm, which was invented in 1959 by D. L. Shell, is that in early stages, far-apart elements are compared, rather than adjacent ones as in simpler interchange sorts. This tends to eliminate large amounts of disorder quickly, so later stages have less work to do. The interval between compared elements is gradually decreased to one, at which point the sort effectively becomes an adjacent interchange method. /* shellsort: sort v[0]...v[n-1] into increasing order */ void shellsort(int v[], int n) { 58 int gap, i, j, temp; for (gap = n/2; gap > 0; gap /= 2) for (i = gap; i < n; i++) for (j=i-gap; j>=0 && v[j]>v[j+gap]; j-=gap) { temp = v[j]; v[j] = v[j+gap]; v[j+gap] = temp; } } There are three nested loops.

This in printd(123) the first printd receives the argument n = 123. It passes 12 to a second printd, which in turn passes 1 to a third. The third-level printd prints 1, then returns to the second level. That printd prints 2, then returns to the first level. That one prints 3 and terminates. Another good example of recursion is quicksort, a sorting algorithm developed by C.A.R. Hoare in 1962. Given an array, one element is chosen and the others partitioned in two subsets - those less than the partition element and those greater than or equal to it. The same process is then applied recursively to the two subsets. When a subset has fewer than two elements, it doesn't need any sorting; this stops the recursion.


pages: 263 words: 20,730

Exploring Python by Timothy Budd

c2.com, centre right, duck typing, functional programming, general-purpose programming language, Guido van Rossum, higher-order functions, index card, random walk, sorting algorithm, web application

Example: QuickSort The sorting algorithm quick sort provides another illustration of how a problem can be described as a transformation. Quick sort is a recursive algorithm that works by (a) Exploring Python – Chapter 8: Functional Programming 7 selecting some element, termed the pivot, (b) dividing the original list into three parts, namely those that are smaller than the pivot, those equal to the pivot, and those larger than the pivot, and (c) recursively sorting the first and third, appending the results to obtain the final solution. Once you have described the quick sort algorithm in this fashion, the solution is a simple transliteration: def quicksort(a): if a: # there are various ways of selecting the pivot # we simply choose the middle element pivot = a[len(a)/2] return (quickSort([x for x in a if x < pivot]) + [x for x in a if x == pivot] + quickSort([x for x in a if x > pivot])) else: return [ ] We have illustrated higher order functions by passing lambda expressions to functions such as filter and map.

Example – File Sort As an example program presented earlier shows, it is easy to sort the lines of a file if your computer has sufficient memory to maintain the contents of the file in a list. Simply read the file into the list, sort the list, then write the list out to a new file. But what if you have a very large file, one that is too big to fit into memory? The algorithm used to solve this problem is known as file sort. The file sort algorithm uses a number of temporary files as intermediate storage areas. The approach works in three steps. In step 1, the original file is read in small units, say 100 lines at a time. Each unit is sorted and written out to a temporary file. Once these have been created the second step begins. In this step pairs of temporary files are merged into a new file.

Once you have described the quick sort algorithm in this fashion, the solution is a simple transliteration: def quicksort(a): if a: # there are various ways of selecting the pivot # we simply choose the middle element pivot = a[len(a)/2] return (quickSort([x for x in a if x < pivot]) + [x for x in a if x == pivot] + quickSort([x for x in a if x > pivot])) else: return [ ] We have illustrated higher order functions by passing lambda expressions to functions such as filter and map. The flip side is to write a function that accepts a function as argument. For example, you might want a sorting function that allows the user to provide the comparison test as an argument, rather than using the < operator. The quick sort algorithm rewritten to allow the comparison test to be passed as argument is as follows: def quicksort(a, cmp): if a: pivot = a[len(a)/2] return (quicksort([x for x in a if cmp(x, pivot)],cmp)+ [x for x in a if x == pivot] + quicksort([x for x in a if cmp(pivot, x)], cmp)) else: return [ ] This version of quicksort could be invoked as follows: >>> a = [1, 6, 4, 2, 5, 3, 7] >>> print quicksort(a, lambda x, y: x > y) [7, 6, 5, 4, 3, 2, 1] # sort backwards Simple Reductions Many common tasks can be implemented as a form of reduction.


Algorithms in C++ Part 5: Graph Algorithms by Robert Sedgewick

Erdős number, functional programming, linear programming, linked data, NP-complete, reversible computing, search costs, sorting algorithm, traveling salesman

• 19.102 Given a DAG, does there exist a topological sort that cannot result from applying the source-queue algorithm, no matter what queue discipline is used? Prove your answer. 19.103 Modify the source-queue topological-sort algorithm to use a generalized queue. Use your modified algorithm with a LIFO queue, a stack, and a randomized queue. • 19.104 Use Program 19.8 to provide an implementation for a class for verifying that a DAG has no cycles (see Exercise 19.75). • 19.105 Convert the source-queue topological-sort algorithm into a sink-queue algorithm for reverse topological sorting. 19.106 Write a program that generates all possible topological orderings of a given DAG, or, if the number of such orderings exceeds a bound taken as an argument, prints that number. 19.107 Write a program that converts any digraph with V vertices and E edges into a DAG by doing a DFS-based topological sort and changing the orientation of any back edge encountered.

Carefully chosen examples, detailed figures, and complete implementations supplement thorough descriptions of algorithms and applications. Algorithms This book is the second of three volumes that are intended to survey the most important computer algorithms in use today. The first volume (Parts 1–4) covers fundamental concepts (Part 1), data structures (Part 2), sorting algorithms (Part 3), and searching algorithms (Part 4); this volume (Part 5) covers graphs and graph algorithms; and the (yet to be published) third volume (Parts 6–8) covers strings (Part 6), computational geometry (Part 7), and advanced algorithms and applications (Part 8). The books are useful as texts early in the computer science curriculum, after students have acquired basic programming skills and familiarity with computer systems, but before they have taken specialized courses in advanced areas of computer science or computer applications.

But once we have been to V +1 vertices, we must have seen a directed cycle, by the pigeonhole principle (see Property 19.6), which contradicts the assumption that we have a DAG. Therefore, every DAG has at least one sink. It follows that every DAG also has at least one source: its reverse’s sink. From this fact, we can derive a topological-sort algorithm: Label any source with the smallest unused label, then remove it and label the rest of the DAG, using the same algorithm. Figure 19.25 is a trace of this algorithm in operation for our sample DAG. Implementing this algorithm efficiently is a classic exercise in data-structure design ( see reference section ).


pages: 499 words: 144,278

Coders: The Making of a New Tribe and the Remaking of the World by Clive Thompson

"Margaret Hamilton" Apollo, "Susan Fowler" uber, 2013 Report for America's Infrastructure - American Society of Civil Engineers - 19 March 2013, 4chan, 8-hour work day, Aaron Swartz, Ada Lovelace, AI winter, air gap, Airbnb, algorithmic bias, AlphaGo, Amazon Web Services, Andy Rubin, Asperger Syndrome, augmented reality, Ayatollah Khomeini, backpropagation, barriers to entry, basic income, behavioural economics, Bernie Sanders, Big Tech, bitcoin, Bletchley Park, blockchain, blue-collar work, Brewster Kahle, Brian Krebs, Broken windows theory, call centre, Cambridge Analytica, cellular automata, Charles Babbage, Chelsea Manning, Citizen Lab, clean water, cloud computing, cognitive dissonance, computer vision, Conway's Game of Life, crisis actor, crowdsourcing, cryptocurrency, Danny Hillis, data science, David Heinemeier Hansson, deep learning, DeepMind, Demis Hassabis, disinformation, don't be evil, don't repeat yourself, Donald Trump, driverless car, dumpster diving, Edward Snowden, Elon Musk, Erik Brynjolfsson, Ernest Rutherford, Ethereum, ethereum blockchain, fake news, false flag, Firefox, Frederick Winslow Taylor, Free Software Foundation, Gabriella Coleman, game design, Geoffrey Hinton, glass ceiling, Golden Gate Park, Google Hangouts, Google X / Alphabet X, Grace Hopper, growth hacking, Guido van Rossum, Hacker Ethic, hockey-stick growth, HyperCard, Ian Bogost, illegal immigration, ImageNet competition, information security, Internet Archive, Internet of things, Jane Jacobs, John Markoff, Jony Ive, Julian Assange, Ken Thompson, Kickstarter, Larry Wall, lone genius, Lyft, Marc Andreessen, Mark Shuttleworth, Mark Zuckerberg, Max Levchin, Menlo Park, meritocracy, microdosing, microservices, Minecraft, move 37, move fast and break things, Nate Silver, Network effects, neurotypical, Nicholas Carr, Nick Bostrom, no silver bullet, Northpointe / Correctional Offender Management Profiling for Alternative Sanctions, Oculus Rift, off-the-grid, OpenAI, operational security, opioid epidemic / opioid crisis, PageRank, PalmPilot, paperclip maximiser, pattern recognition, Paul Graham, paypal mafia, Peter Thiel, pink-collar, planetary scale, profit motive, ransomware, recommendation engine, Richard Stallman, ride hailing / ride sharing, Rubik’s Cube, Ruby on Rails, Sam Altman, Satoshi Nakamoto, Saturday Night Live, scientific management, self-driving car, side project, Silicon Valley, Silicon Valley ideology, Silicon Valley startup, single-payer health, Skype, smart contracts, Snapchat, social software, software is eating the world, sorting algorithm, South of Market, San Francisco, speech recognition, Steve Wozniak, Steven Levy, systems thinking, TaskRabbit, tech worker, techlash, TED Talk, the High Line, Travis Kalanick, Uber and Lyft, Uber for X, uber lyft, universal basic income, urban planning, Wall-E, Watson beat the top human players on Jeopardy!, WeWork, WikiLeaks, women in the workforce, Y Combinator, Zimmermann PGP, éminence grise

Its massive, shared attention pool has made News Feed one of the surest vectors by which a piece of culture goes viral, from a tear-jerky video of a kind act to an outtake by Beyoncé, from the hopeful, pro-democratic beginnings of the Arab Spring to virulent ISIS recruitment videos. News Feed tied people together and propelled a host of acronymized pop-psychology ailments, from TMI to FOMO. The feed got people to stare at Facebook a lot—on average, 35 minutes a day for each American. It’s not hard to see why. The feed’s sorting algorithm is designed to give you more of what you like; it pays close attention to everything you do on Facebook—your “likes,” your reposts, your comments—the better to find new items to show you, that, the programmers hope, match neatly with your preferences. Giving people mostly what they want to see makes for a terrific business, of course, which is why Facebook made about $40 billion a year in advertising in 2017.

Any opportunity they have to automate a process, to do something more efficiently, they will. (Sanghvi even extended this to her nuptials. Her mother wanted to arrange Sanghvi’s marriage in India, and Sanghvi agreed, because it struck her as far more efficient than dating—which you could regard, from a purely computer-scientific point of view, as a woefully resource-intensive sorting algorithm. “I was a big fan of arranged marriages,” as she said once in a speech. “It appealed to the engineer in me. They’re practical, and they had a higher probability of succeeding.”) This instinctive desire to optimize—and scale—is what has led to many collisions between software firms and civic life.

This is the only way I know how to do it.’ ” But ultimately the Google engineers had quite a bit of respect for what he and Rainert had built, Crowley says. Sure, they could code circles around Crowley, and there were thousands of them. If these engineers were given a coding challenge, they could knock it out of the park. They could optimize a sorting algorithm so that it executes in 15 milliseconds instead of 150 milliseconds; a 10X improvement from a 10X coder. But Crowley had something equally as powerful, if even more so: a bold, crazy new idea like Dodgeball. He had a unique way of looking at the world, while wandering from bar to bar, noticing the joy that came from friends sharing their travels through a dazzling, mazelike city.


pages: 923 words: 516,602

The C++ Programming Language by Bjarne Stroustrup

combinatorial explosion, conceptual framework, database schema, Dennis Ritchie, distributed generation, Donald Knuth, fault tolerance, functional programming, general-purpose programming language, higher-order functions, index card, iterative process, job-hopping, L Peter Deutsch, locality of reference, Menlo Park, no silver bullet, Parkinson's law, premature optimization, sorting algorithm

Section 13.4 Using Template Arguments to Specify Policy 339 like to use for a comparison? Two different collating sequences (numerical orderings of the characters) are commonly used for sorting Swedish names. Naturally, neither a general string type nor a general sort algorithm should know about the conventions for sorting names in Sweden. Therefore, any general solution requires that the sorting algorithm be expressed in general terms that can be defined not just for a specific type but also for a specific use of a specific type. For example, let us generalize the standard C library function ssttrrccm mpp() for SSttrriinnggs of any type T (§13.2): tteem mppllaattee<ccllaassss T T, ccllaassss C C> iinntt ccoom mppaarree(ccoonnsstt SSttrriinngg<T T>& ssttrr11, ccoonnsstt SSttrriinngg<T T>& ssttrr22) { ffoorr(iinntt ii=00; ii<ssttrr11.lleennggtthh() && ii< ssttrr22.lleennggtthh(); ii++) iiff (!

Remove eexxtteerrnn declarations from .cc files and if necessary convert all function definitions to C++ function definition syntax. Replace calls of m maalllloocc() and ffrreeee() with nneew w and ddeelleettee. Remove unnecessary casts. 6. (∗2) Implement ssssoorrtt() (§7.7) using a more efficient sorting algorithm. Hint: qqssoorrtt(). 7. (∗2.5) Consider: ssttrruucctt T Tnnooddee { ssttrriinngg w woorrdd; iinntt ccoouunntt; T Tnnooddee* lleefftt; T Tnnooddee* rriigghhtt; }; The C++ Programming Language, Third Edition by Bjarne Stroustrup. Copyright ©1997 by AT&T. Published by Addison Wesley Longman, Inc.

Published by Addison Wesley Longman, Inc. ISBN 0-201-88954-4. All rights reserved. 328 Templates Chapter 13 sophisticated implementations behind simple interfaces and to expose complexity to the user only when the user has a specific need for it. For example, ssoorrtt(vv) can be the interface to a variety of sort algorithms for elements of a variety of types held in a variety of containers. The sort function that is most appropriate for the particular v will be automatically chosen. Every major standard library abstraction is represented as a template (for example, ssttrriinngg, oossttrreeaam m, ccoom mpplleexx, lliisstt, and m maapp) and so are the key operations (for example, ssttrriinngg compare, the output operator <<, ccoom mpplleexx addition, getting the next element from a lliisstt, and ssoorrtt()).


pages: 292 words: 62,575

97 Things Every Programmer Should Know by Kevlin Henney

A Pattern Language, active measures, Apollo 11, business intelligence, business logic, commoditize, continuous integration, crowdsourcing, database schema, deliberate practice, domain-specific language, don't repeat yourself, Donald Knuth, fixed income, functional programming, general-purpose programming language, Grace Hopper, index card, inventory management, job satisfaction, level 1 cache, loose coupling, machine readable, Silicon Valley, sorting algorithm, The Wisdom of Crowds

When you're designing an application, be mindful of the number of interprocess communications in response to each stimulus. When analyzing applications that suffer from poor performance, I have often found IPC-to-stimulus ratios of thousands-to-one. Reducing this ratio, whether by caching or parallelizing or some other technique, will pay off much more than changing data structure choice or tweaking a sorting algorithm. * * * [6] http://martinfowler.com/eaaCatalog/lazyLoad.html Chapter 42. Keep the Build Clean Johannes Brodwall HAVE YOU EVER LOOKED AT a list of compiler warnings the length of an essay on bad coding and thought to yourself, "You know, I really should do something about that…but I don't have time just now"?

Test Precisely and Concretely Kevlin Henney IT IS IMPORTANT TO TEST for the desired, essential behavior of a unit of code, rather than for the incidental behavior of its particular implementation. But this should not be taken or mistaken as an excuse for vague tests. Tests need to be both accurate and precise. Something of a tried, tested, and testing classic, sorting routines offer an illustrative example. Implementing a sorting algorithm is not necessarily an everyday task for a programmer, but sorting is such a familiar idea that most people believe they know what to expect from it. This casual familiarity, however, can make it harder to see past certain assumptions. When programmers are asked, "What would you test for?", by far and away the most common response is something like, "The result of sorting is a sorted sequence of elements."


pages: 375 words: 66,268

High Performance JavaScript by Nicholas C. Zakas

en.wikipedia.org, Firefox, Google Chrome, sorting algorithm, web application

Iterative algorithms typically consist of several different loops performing different aspects of the process, and thus introduce their own performance issues. However, using optimized loops in place of long-running recursive functions can result in performance improvements due to the lower overhead of loops versus that of executing a function. As an example, the merge sort algorithm is most frequently implemented using recursion. A simple JavaScript implementation of merge sort is as follows: function merge(left, right){ var result = []; while (left.length > 0 && right.length > 0){ if (left[0] < right[0]){ result.push(left.shift()); } else { result.push(right.shift()); } } return result.concat(left).concat(right); } function mergeSort(items){ if (items.length == 1) { return items; } var middle = Math.floor(items.length / 2), left = items.slice(0, middle), 76 | Chapter 4: Algorithms and Flow Control right = items.slice(middle); return merge(mergeSort(left), mergeSort(right)); } The code for this merge sort is fairly simple and straightforward, but the mergeSort() function itself ends up getting called very frequently.

An array of n items ends up calling mergeSort() 2 * n –1 times, meaning that an array with more than 1,500 items would cause a stack overflow error in Firefox. Running into the stack overflow error doesn’t necessarily mean the entire algorithm has to change; it simply means that recursion isn’t the best implementation. The merge sort algorithm can also be implemented using iteration, such as: //uses the same mergeSort() function from previous example function mergeSort(items){ if (items.length == 1) { return items; } var work = []; for (var i=0, len=items.length; i < len; i++){ work.push([items[i]]); } work.push([]); //in case of odd number of items for (var lim=len; lim > 1; lim = (lim+1)/2){ for (var j=0,k=0; k < lim; j++, k+=2){ work[j] = merge(work[k], work[k+1]); } work[j] = []; //in case of odd number of items } return work[0]; } This implementation of mergeSort() does the same work as the previous one without using recursion.


pages: 236 words: 67,823

Hacking Vim 7.2 by Kim Schulz

Alexey Pajitnov wrote Tetris, Bram Moolenaar, Debian, revision control, sorting algorithm

You simply have to use the sort()function on the list you get out of the keys() function. let mydict = {a: "apple", b:"banana", c: "citrus" } for keyvar in sort(keys(mydict)) echo mydict[keyvar] endfor This, of course, requires that the names of the keys can be ordered individually by using a normal sort algorithm. In the previous case, there is no problem because a is before b, which is before c. The sort function can actually take another argument, which is a function name. This way you can make your own sort algorithm to use when sorting special values. See :help sort() for more information and an example. [ 165 ] Basic Vim Scripting While loops The next type of loop we will look at is the while loop.


pages: 410 words: 119,823

Radical Technologies: The Design of Everyday Life by Adam Greenfield

3D printing, Airbnb, algorithmic bias, algorithmic management, AlphaGo, augmented reality, autonomous vehicles, bank run, barriers to entry, basic income, bitcoin, Black Lives Matter, blockchain, Boston Dynamics, business intelligence, business process, Californian Ideology, call centre, cellular automata, centralized clearinghouse, centre right, Chuck Templeton: OpenTable:, circular economy, cloud computing, Cody Wilson, collective bargaining, combinatorial explosion, Computer Numeric Control, computer vision, Conway's Game of Life, CRISPR, cryptocurrency, David Graeber, deep learning, DeepMind, dematerialisation, digital map, disruptive innovation, distributed ledger, driverless car, drone strike, Elon Musk, Ethereum, ethereum blockchain, facts on the ground, fiat currency, fulfillment center, gentrification, global supply chain, global village, Goodhart's law, Google Glasses, Herman Kahn, Ian Bogost, IBM and the Holocaust, industrial robot, informal economy, information retrieval, Internet of things, Jacob Silverman, James Watt: steam engine, Jane Jacobs, Jeff Bezos, Jeff Hawkins, job automation, jobs below the API, John Conway, John Markoff, John Maynard Keynes: Economic Possibilities for our Grandchildren, John Maynard Keynes: technological unemployment, John Perry Barlow, John von Neumann, joint-stock company, Kevin Kelly, Kickstarter, Kiva Systems, late capitalism, Leo Hollis, license plate recognition, lifelogging, M-Pesa, Mark Zuckerberg, means of production, megacity, megastructure, minimum viable product, money: store of value / unit of account / medium of exchange, natural language processing, Network effects, New Urbanism, Nick Bostrom, Occupy movement, Oculus Rift, off-the-grid, PalmPilot, Pareto efficiency, pattern recognition, Pearl River Delta, performance metric, Peter Eisenman, Peter Thiel, planetary scale, Ponzi scheme, post scarcity, post-work, printed gun, proprietary trading, RAND corporation, recommendation engine, RFID, rolodex, Rutger Bregman, Satoshi Nakamoto, self-driving car, sentiment analysis, shareholder value, sharing economy, Shenzhen special economic zone , Sidewalk Labs, Silicon Valley, smart cities, smart contracts, social intelligence, sorting algorithm, special economic zone, speech recognition, stakhanovite, statistical model, stem cell, technoutopianism, Tesla Model S, the built environment, The Death and Life of Great American Cities, The Future of Employment, Tony Fadell, transaction costs, Uber for X, undersea cable, universal basic income, urban planning, urban sprawl, vertical integration, Vitalik Buterin, warehouse robotics, When a measure becomes a target, Whole Earth Review, WikiLeaks, women in the workforce

In the United States, for example, the Federal Trade Commission’s inventory of Equal Credit Opportunity Rights explicitly “prohibits credit discrimination on the basis of race, color, religion, national origin, sex, marital status, age, or because you get public assistance,” and this is intended to protect certain classes of people who have historically been denied access to financing.67 Without access to an algorithm, there is no way of knowing whether it observes those provisions—or, perhaps more worryingly, whether the behaviors it weighs transparently serve as proxies for factors that lenders are specifically forbidden by law to consider in their provision of credit. And finally, without access to its composition, we can’t reconstruct whether the conclusions an algorithm arrives at bear even the slightest relationship to someone’s actual propensity to repay a loan. Like any other sorting algorithm, the ones used in the determination of creditworthiness always direct our attention to a subset of the information that is available. That information may have less bearing on someone’s trustworthiness than other facts which might well be more salient, but which by their nature are less accessible to the lender.

As Oxford researchers Bryce Goodman and Seth Flaxman point out, then, the EU regulation is either too narrowly written to be effective, or so broadly interpretable as to be unenforceable. This suggests that it isn’t so much the obscurity of any specific algorithm that presents would-be regulators with their greatest challenge, but the larger obscurity of the way in which sorting algorithms work. And this impression is reinforced by the law’s second major provision, which aims directly at the question of algorithmic opacity. Its Articles 12 and 13 create “the right to an explanation,” requiring that anyone affected by the execution of some algorithmic system be offered the means to understand exactly how it arrived at its judgment, in a “concise, intelligible and easily accessible form, using clear and plain language.”

This clearly relies entirely too much on the initiative, the bravery and the energy of the individual, and fails to account for those situations, and they will be many, in which that individual is not offered any meaningful choice of action. Furthermore, this sort of accountability is ill-suited to the time scale in which algorithmic decisions take place—which is to say, in real time. Explanation and redress are by definition reactive and ex post facto. The ordinary operation of a sorting algorithm will generally create a new set of facts on the ground,72 setting new chains of cause and effect in motion; these will reshape the world, in ways that are difficult if not impossible to reverse, long before anyone is able to secure an explanation. It’s evident that the authors of this well-intended regulation either haven’t quite understood how algorithms achieve their effects, or have failed to come up with language that might meaningfully constrain how they operate.


When Computers Can Think: The Artificial Intelligence Singularity by Anthony Berglas, William Black, Samantha Thalind, Max Scratchmann, Michelle Estes

3D printing, Abraham Maslow, AI winter, air gap, anthropic principle, artificial general intelligence, Asilomar, augmented reality, Automated Insights, autonomous vehicles, availability heuristic, backpropagation, blue-collar work, Boston Dynamics, brain emulation, call centre, cognitive bias, combinatorial explosion, computer vision, Computing Machinery and Intelligence, create, read, update, delete, cuban missile crisis, David Attenborough, DeepMind, disinformation, driverless car, Elon Musk, en.wikipedia.org, epigenetics, Ernest Rutherford, factory automation, feminist movement, finite state, Flynn Effect, friendly AI, general-purpose programming language, Google Glasses, Google X / Alphabet X, Gödel, Escher, Bach, Hans Moravec, industrial robot, Isaac Newton, job automation, John von Neumann, Law of Accelerating Returns, license plate recognition, Mahatma Gandhi, mandelbrot fractal, natural language processing, Nick Bostrom, Parkinson's law, patent troll, patient HM, pattern recognition, phenotype, ransomware, Ray Kurzweil, Recombinant DNA, self-driving car, semantic web, Silicon Valley, Singularitarianism, Skype, sorting algorithm, speech recognition, statistical model, stem cell, Stephen Hawking, Stuxnet, superintelligent machines, technological singularity, Thomas Malthus, Turing machine, Turing test, uranium enrichment, Von Neumann architecture, Watson beat the top human players on Jeopardy!, wikimedia commons, zero day

The loop invariant essentially defines the different types of sorting algorithms. One common algorithm known as the insertion sort has an invariant that the left part of the list is internally sorted, but that there may still be smaller values in the righthand part. Another algorithm known as quicksort has the invariant that all the values on the left are less than all the values on the right, but that neither side is sorted. Shell sort has an invariant that sub lists of every kth element are sorted. The details are not important; the point is that having different intermediate invariants largely defines the different sorting algorithms. The problem of (semi) automatically proving that programs are correct has been the subject of considerable research, and is far from being fully solved.

Early work by Hoare and others addresses this problem by describing the input and output of a program in terms of mathematical logic. Each step of the program is also defined in mathematical logic, so proving that the program is correct involves proving that the output follows the input given the program steps. Classic examples of this process involve sorting algorithms, whose job it is to sort a list of numbers in ascending order. It turns out there are many different ways of performing this task, but they all have the same definition of their output, namely a sorted copy of their input. The simplest, and one of the least efficient, algorithms is the selection sort.


pages: 297 words: 77,362

The Nature of Technology by W. Brian Arthur

Andrew Wiles, Boeing 747, business process, Charles Babbage, cognitive dissonance, computer age, creative destruction, double helix, endogenous growth, financial engineering, Geoffrey West, Santa Fe Institute, haute cuisine, James Watt: steam engine, joint-stock company, Joseph Schumpeter, Kenneth Arrow, Kevin Kelly, knowledge economy, locking in a profit, Mars Rover, means of production, Myron Scholes, power law, punch-card reader, railway mania, Recombinant DNA, Silicon Valley, Simon Singh, sorting algorithm, speech recognition, Stuart Kauffman, technological determinism, technological singularity, The Wealth of Nations by Adam Smith, Thomas Kuhn: the structure of scientific revolutions

Nothing equivalent to DNA or its cellular workings of course, or as beautifully ordered as this. But still, a rich interlinked ancestry. All this sounds organic—very organic—and indeed the view we will be led to is as much biological as mechanical. For sure, technologies are not biological organisms, they are mechanistic almost by definition: whether they are sorting algorithms or atomic clocks, they have parts that interact in predictable ways. But once we lay them out as combinations that can be combined into further combinations, we begin to see them not so much as individual pieces of clockwork but as complexes of working processes that interact with other complexes to form new ones.

It is a miniature extraction process, a tiny factory that these days can be held in the palm of a hand. All devices in fact process something. That, after all, is why economists refer to technologies as means of production. Does the correspondence work in the other direction? Can we view methods and processes as devices? The answer is yes. Processes and methods—think of oil refining or sorting algorithms—are sequences of operations. But to execute, they always require some hardware, some sort of physical equipment that carries out the operations. We could see this physical equipment as forming a “device” executing this sequence of operations. In the case of oil refining this would be a pretty big device.


pages: 301 words: 85,263

New Dark Age: Technology and the End of the Future by James Bridle

AI winter, Airbnb, Alfred Russel Wallace, AlphaGo, Anthropocene, Automated Insights, autonomous vehicles, back-to-the-land, Benoit Mandelbrot, Bernie Sanders, bitcoin, Boeing 747, British Empire, Brownian motion, Buckminster Fuller, Cambridge Analytica, Capital in the Twenty-First Century by Thomas Piketty, carbon footprint, coastline paradox / Richardson effect, cognitive bias, cognitive dissonance, combinatorial explosion, computer vision, congestion charging, cryptocurrency, data is the new oil, disinformation, Donald Trump, Douglas Engelbart, Douglas Engelbart, Douglas Hofstadter, Dr. Strangelove, drone strike, Edward Snowden, Eyjafjallajökull, Fairchild Semiconductor, fake news, fear of failure, Flash crash, fulfillment center, Google Earth, Greyball, Haber-Bosch Process, Higgs boson, hive mind, income inequality, informal economy, Internet of things, Isaac Newton, ITER tokamak, James Bridle, John von Neumann, Julian Assange, Kickstarter, Kim Stanley Robinson, Large Hadron Collider, late capitalism, Laura Poitras, Leo Hollis, lone genius, machine translation, mandelbrot fractal, meta-analysis, Minecraft, mutually assured destruction, natural language processing, Network effects, oil shock, p-value, pattern recognition, peak oil, recommendation engine, road to serfdom, Robert Mercer, Ronald Reagan, security theater, self-driving car, Seymour Hersh, Silicon Valley, Silicon Valley ideology, Skype, social graph, sorting algorithm, South China Sea, speech recognition, Spread Networks laid a new fibre optics cable between New York and Chicago, stem cell, Stuxnet, technoutopianism, the built environment, the scientific method, Uber for X, undersea cable, University of East Anglia, uranium enrichment, Vannevar Bush, warehouse robotics, WikiLeaks

At lunch one day, his father had held the button down on his iPhone a little long, resulting in a burst of images of the same scene. Smith uploaded two of them, to see which his wife preferred. In one, he was smiling, but his wife was not; in the other, his wife was smiling, but he was not. From these two images, taken seconds apart, Google’s photo-sorting algorithms had conjured a third: a composite in which both subjects were smiling their ‘best’. The algorithm was part of a package called AutoAwesome (since renamed, simply, ‘Assistant’), which performed a range of tweaks on uploaded images to make them more ‘awesome’ – applying nostalgic filters, turning them into charming animations, and so forth.

One of the first symptoms of clinical paranoia is the belief that somebody is watching you; but this belief is now a reasonable one. Every email we send; every text message we write; every phone call we make; every journey we take; each step, breath, dream, and utterance is the target of vast systems of automated intelligence gathering, the sorting algorithms of social networks and spam factories, and the sleepless gaze of our own smartphones and connected devices. So who’s paranoid now? It’s November 2014 and I’m standing on an access road in a field near Farnborough, in Hampshire, England. I’m waiting for a plane to fly overhead. I don’t know when it’s going to take off, or if it’s going to fly at all.


pages: 282 words: 85,658

Ask Your Developer: How to Harness the Power of Software Developers and Win in the 21st Century by Jeff Lawson

Airbnb, AltaVista, Amazon Web Services, barriers to entry, big data - Walmart - Pop Tarts, Big Tech, big-box store, bitcoin, business process, call centre, Chuck Templeton: OpenTable:, cloud computing, coronavirus, COVID-19, create, read, update, delete, cryptocurrency, data science, David Heinemeier Hansson, deep learning, DevOps, Elon Musk, financial independence, global pandemic, global supply chain, Hacker News, Internet of things, Jeff Bezos, Kanban, Lean Startup, loose coupling, Lyft, Marc Andreessen, Marc Benioff, Mark Zuckerberg, microservices, minimum viable product, Mitch Kapor, move fast and break things, Paul Graham, peer-to-peer, ride hailing / ride sharing, risk tolerance, Ruby on Rails, Salesforce, side project, Silicon Valley, Silicon Valley startup, Skype, social distancing, software as a service, software is eating the world, sorting algorithm, Startup school, Steve Ballmer, Steve Jobs, Telecommunications Act of 1996, Toyota Production System, transaction costs, transfer pricing, two-pizza team, Uber and Lyft, uber lyft, ubercab, web application, Y Combinator

I took computer science classes, learning the fundamentals of how computers work, like how CPUs function, how memory works, and even the theoretical aspects of modern computation, like why binary is the chosen underpinning of computers, and more. I also started learning the basics of software development, writing sort algorithms in C and C++, loops, functions, and structures. This was all very interesting, but, as you’ve probably guessed by now, I’m more of an applied kind of person. I wanted to build more than just sort algorithms that had been built millions of times before by millions of other students. In the summer of 1997, after a couple of years of computer science courses, I landed an internship at Citysearch in Pasadena, California, nestled in the mountains just northeast of downtown Los Angeles.


pages: 573 words: 157,767

From Bacteria to Bach and Back: The Evolution of Minds by Daniel C. Dennett

Ada Lovelace, adjacent possible, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, AlphaGo, Andrew Wiles, Bayesian statistics, bioinformatics, bitcoin, Bletchley Park, Build a better mousetrap, Claude Shannon: information theory, computer age, computer vision, Computing Machinery and Intelligence, CRISPR, deep learning, disinformation, double entry bookkeeping, double helix, Douglas Hofstadter, Elon Musk, epigenetics, experimental subject, Fermat's Last Theorem, Gödel, Escher, Bach, Higgs boson, information asymmetry, information retrieval, invention of writing, Isaac Newton, iterative process, John von Neumann, language acquisition, megaproject, Menlo Park, Murray Gell-Mann, Necker cube, Norbert Wiener, pattern recognition, phenotype, Richard Feynman, Rodney Brooks, self-driving car, social intelligence, sorting algorithm, speech recognition, Stephen Hawking, Steven Pinker, strong AI, Stuart Kauffman, TED Talk, The Wealth of Nations by Adam Smith, theory of mind, Thomas Bayes, trickle-down economics, Turing machine, Turing test, Watson beat the top human players on Jeopardy!, Y2K

“Look Ma, no hands!” The structure of an intelligent agent What will happen to us? Home at last Appendix: The Background References Index LIST OF ILLUSTRATIONS Figure 1.1: Duck-rabbit Figure 1.2: Necker cube Figure 3.1: Kessler and Werner, stone circles Figure 3.2: Kessler and Werner’s stone-sorting algorithm at work Figure 4.1: Elevator operator manual page Figure 5.1: Clam rake Figure 7.1: Darwinian space Figure 7.2: Darwinian space with other dimensions Figure 7.3: Darwinian space for origin of life Figure 7.4: Darwinian space with religions Figure 7.5: Inverted Darwinian space with Darwinian phenomena at (0,0,0) and intelligent design at (1,1,1) Figure 9.2: Glossogenetic tree of all languages Figure 9.3: Selfridge’s automatic CAT Figure 13.1: Darwinian space Figure 13.2: Darwinian space of cultural evolution with intermediate phenomena Color insert following page 238 Figure 3.3: Australian termite castle Figure 3.4: Gaudí, La Sagrada Familia Figure 9.1: The Great Tree of Life Figure 12.1: Claidière et al., random patterns evolve into memorable tetrominos Figure 14.1: Complementary color afterimage PREFACE I started trying to think seriously about the evolution of the human mind when I was a graduate student in philosophy in Oxford in 1963 and knew almost nothing about either evolution or the human mind.

That might mean that this practice is an outdated vestige of prescientific thinking—and many biologists surmise as much—or it might mean that biologists have found a brilliant extension of reverse engineering into the living realm, using the thinking tools Nature has endowed us with to discover real patterns in the world that can well be called the reasons for the existence of other real patterns. To defend the latter claim, we need to take a look at how evolution itself could get going. Go forth and multiply In Darwin’s Dangerous Idea (1995), I argued that natural selection is an algorithmic process, a collection of sorting algorithms that are themselves composed of generate-and-test algorithms that exploit randomness (pseudo-randomness, chaos) in the generation phase, and some sort of mindless quality-control testing phase, with the winners advancing in the tournament by having more offspring. How does this cascade of generative processes get under way?

So we have a pretty good idea how come these phenomena exist where they do, and anybody who encountered these stone circles and concluded that there had to be a purposeful artificer behind them, an answer to the what for question, would be wrong. FIGURE 3.1: Kessler and Werner, stone circles. © Science magazine and Mark A. Kessler. FIGURE 3.2: Kessler and Werner’s stone-sorting algorithm at work. © Science magazine and Mark A. Kessler. In the abiotic world, many similar cycles occur concurrently but asynchronously, wheels within wheels within wheels, with different periods of repetition, “exploring” the space of chemical possibility. This would be a variety of parallel processing, a little bit like the mass production of industry, which makes lots of different parts in different places at different rates of production and then brings them together for assembly, except that this abiotic mass production is utterly unplanned and unmotivated.


pages: 197 words: 35,256

NumPy Cookbook by Ivan Idris

business intelligence, cloud computing, computer vision, data science, Debian, en.wikipedia.org, Eratosthenes, mandelbrot fractal, p-value, power law, sorting algorithm, statistical model, transaction costs, web application

We will review a number of techniques that are popular amongst NumPy users. Profiling with timeit timeit is a module that allows you to time pieces of code. It is part of the standard Python library. We will time the NumPy sort function with several different array sizes. The classic quicksort and merge sort algorithms have an average running time of O(nlogn); so we will try to fit our result to such a model. How to do it... We will require arrays to sort. Create arrays to sort.We will create arrays of varying sizes containing random integer values: times = numpy.array([]) for size in sizes: integers = numpy.random.random_integers(1, 10 ** 6, size) Measure time.In order to measure time, we need to create a timer and give it a function to execute and specify the relevant imports.


pages: 194 words: 36,223

Smart and Gets Things Done: Joel Spolsky's Concise Guide to Finding the Best Technical Talent by Joel Spolsky

AOL-Time Warner, Build a better mousetrap, David Heinemeier Hansson, functional programming, knowledge worker, linear programming, no silver bullet, nuclear winter, off-by-one error, Ruby on Rails, Sand Hill Road, Silicon Valley, sorting algorithm, Superbowl ad, the scientific method, type inference, unpaid internship

The Guerrilla Guide to Interviewing 111 A lot of programmers whom you might interview these days are apt to consider recursion, pointers, and even data structures to be a silly implementation detail that has been abstracted away by today’s many happy programming languages. “When was the last time you had to write a sorting algorithm?” they snicker. Still, I don’t really care. I want my ER doctor to understand anatomy, even if all she has to do is put the computerized defibrillator nodes on my chest and push the big red button, and I want programmers to know programming down to the CPU level, even if Ruby on Rails does read your mind and build a complete Web 2.0 social collaborative networking site for you with three clicks of the mouse.


pages: 1,758 words: 342,766

Code Complete (Developer Best Practices) by Steve McConnell

Ada Lovelace, Albert Einstein, Buckminster Fuller, business logic, call centre, classic study, continuous integration, data acquisition, database schema, don't repeat yourself, Donald Knuth, fault tolerance, General Magic , global macro, Grace Hopper, haute cuisine, if you see hoof prints, think horses—not zebras, index card, inventory management, iterative process, Larry Wall, loose coupling, Menlo Park, no silver bullet, off-by-one error, Perl 6, place-making, premature optimization, revision control, Sapir-Whorf hypothesis, seminal paper, slashdot, sorting algorithm, SQL injection, statistical model, Tacoma Narrows Bridge, the Cathedral and the Bazaar, the scientific method, Thomas Kuhn: the structure of scientific revolutions, Turing machine, web application

Recursion isn't useful often, but when used judiciously it produces elegant solutions, as in this example in which a sorting algorithm makes excellent use of recursion: Example 17-5. Java Example of a Sorting Algorithm That Uses Recursion void QuickSort( int firstIndex, int lastIndex, String [] names ) { if ( lastIndex > firstIndex ) { int midPoint = Partition( firstIndex, lastIndex, names ); QuickSort( firstIndex, midPoint-1, names ); <-- 1 QuickSort( midPoint+1, lastIndex, names ) <-- 1 } } (1)Here are the recursive calls. In this case, the sorting algorithm chops an array in two and then calls itself to sort each half of the array.

Level 2: Low-Level Implementation Structures Low-level implementation structures are slightly higher-level structures than those provided by the language itself. They tend to be the operations and data types you learn about in college courses in algorithms and data types: stacks, queues, linked lists, trees, indexed files, sequential files, sort algorithms, search algorithms, and so on. If your program consists entirely of code written at this level, you'll be awash in too much detail to win the battle against complexity. Level 3: Low-Level Problem-Domain Terms At this level, you have the primitives you need to work in terms of the problem domain.


pages: 725 words: 168,262

API Design Patterns by Jj Geewax

Amazon Web Services, anti-pattern, bitcoin, blockchain, business logic, cognitive load, continuous integration, COVID-19, database schema, en.wikipedia.org, exponential backoff, imposter syndrome, Internet of things, Kubernetes, lateral thinking, loose coupling, machine readable, microservices, natural language processing, Paradox of Choice, ride hailing / ride sharing, social graph, sorting algorithm

We analyzed run-time and space complexity using Big-O notation, learned how a variety of sorting algorithms worked, and explored the different ways of traversing binary trees. So as you might imagine, after graduating, I expected my day job to be primarily scientific and mathematical in nature. But imagine my surprise when I found that wasn’t the case at all. It turned out that most of the work I had to do was more about design, structure, and aesthetics than about mathematics and algorithms. I never needed to think about which sorting algorithm to use because there was a library for that (usually something like array.sort()).


pages: 259 words: 67,456

The Mythical Man-Month by Brooks, Jr. Frederick P.

Boeing 747, Conway's law, finite state, HyperCard, Ken Thompson, machine readable, Menlo Park, Multics, no silver bullet, seminal paper, sorting algorithm, speech recognition, Steve Jobs, Strategic Defense Initiative, Tacoma Narrows Bridge, Turing machine, work culture

Then he can begin designing module boundaries, table structures, pass or phase breakdowns, algorithms, and all kinds of tools. Some time, too, must be spent in communicating with the architect. Meanwhile, on the realization level there is much to be done also. Programming has a technology, too. If the machine is a new one, much work must be done on subroutine conventions, supervisory techniques, searching and sorting algorithms.7 Conceptual integrity does require that a system reflect a single philosophy and that the specification as seen by the user flow from a few minds. Because of the real division of labor into architecture, implementation, and realization, however, this does not imply that a system so designed will take longer to build.


Big Data at Work: Dispelling the Myths, Uncovering the Opportunities by Thomas H. Davenport

Automated Insights, autonomous vehicles, bioinformatics, business intelligence, business process, call centre, chief data officer, cloud computing, commoditize, data acquisition, data science, disruptive innovation, Edward Snowden, Erik Brynjolfsson, intermodal, Internet of things, Jeff Bezos, knowledge worker, lifelogging, Mark Zuckerberg, move fast and break things, Narrative Science, natural language processing, Netflix Prize, New Journalism, recommendation engine, RFID, self-driving car, sentiment analysis, Silicon Valley, smart grid, smart meter, social graph, sorting algorithm, statistical model, Tesla Model S, text mining, Thomas Davenport, three-martini lunch

In a blog post (with some ­analytical jargon you can toss around at cocktail parties), he described the difference between vertical and horizontal data scientists: • Vertical data scientists have very deep knowledge in some ­narrow field. They might be computer scientists very ­familiar with computational complexity of all sorting ­algorithms. Or a statistician who knows everything about ­eigenvalues, singular value decomposition and its numerical stability, and asymptotic convergence of maximum ­pseudo-likelihood estimators. Or a software engineer with years of experience writing Python code (including graphic libraries) applied to API development and web crawling technology.


pages: 757 words: 193,541

The Practice of Cloud System Administration: DevOps and SRE Practices for Web Services, Volume 2 by Thomas A. Limoncelli, Strata R. Chalup, Christina J. Hogan

active measures, Amazon Web Services, anti-pattern, barriers to entry, business process, cloud computing, commoditize, continuous integration, correlation coefficient, database schema, Debian, defense in depth, delayed gratification, DevOps, domain-specific language, en.wikipedia.org, fault tolerance, finite state, Firefox, functional programming, Google Glasses, information asymmetry, Infrastructure as a Service, intermodal, Internet of things, job automation, job satisfaction, Ken Thompson, Kickstarter, level 1 cache, load shedding, longitudinal study, loose coupling, machine readable, Malcom McLean invented shipping containers, Marc Andreessen, place-making, platform as a service, premature optimization, recommendation engine, revision control, risk tolerance, Salesforce, scientific management, seminal paper, side project, Silicon Valley, software as a service, sorting algorithm, standardized shipping container, statistical model, Steven Levy, supply-chain management, systems thinking, The future is already here, Toyota Production System, vertical integration, web application, Yogi Berra

The run-time of a system might be described as growing in similar terms. Super-linear systems sound awful compared to sub-linear systems. Why not always use algorithms that are constant or linear? The simplest reason is that often algorithms of that order don’t exist. Sorting algorithms have to touch every item at least once, eliminating the possibility of O(1) algorithms. There is one O(n) sort algorithm but it works on only certain kinds of data. Another reason is that faster algorithms often require additional work ahead of time: for example, building an index makes future searches faster but requires the overhead and complexity of building and maintaining the index.


pages: 1,331 words: 183,137

Programming Rust: Fast, Safe Systems Development by Jim Blandy, Jason Orendorff

bioinformatics, bitcoin, Donald Knuth, duck typing, Elon Musk, Firefox, fizzbuzz, functional programming, mandelbrot fractal, Morris worm, MVC pattern, natural language processing, reproducible builds, side project, sorting algorithm, speech recognition, Turing test, type inference, WebSocket

Only ranges that include a start value are iterable, since a loop must have somewhere to start. But in array slicing, all four forms are useful. If the start or end of the range is omitted, it defaults to the start or end of the data being sliced. So an implementation of quicksort, the classic divide-and-conquer sorting algorithm, might look, in part, like this: fn quicksort<T: Ord>(slice: &mut [T]) { if slice.len() <= 1 { return; // Nothing to sort. } // Partition the slice into two parts, front and back. let pivot_index = partition(slice); // Recursively sort the front half of `slice`. quicksort(&mut slice[.. pivot_index]); // And the back half. quicksort(&mut slice[pivot_index + 1 ..]); } Reference Operators The address-of operators, & and &mut, are covered in Chapter 5.

For example, consider this JavaScript code: // Start an animation that rearranges the rows in a table of cities. function startSortingAnimation(cities, stat) { // Helper function that we'll use to sort the table. // Note that this function refers to stat. function keyfn(city) { return city.get_statistic(stat); } if (pendingSort) pendingSort.cancel(); // Now kick off an animation, passing keyfn to it. // The sorting algorithm will call keyfn later. pendingSort = new SortingAnimation(cities, keyfn); } The closure keyfn is stored in the new SortingAnimation object. It’s meant to be called after startSortingAnimation returns. Now, normally when a function returns, all its variables and arguments go out of scope and are discarded.


The Art of Computer Programming: Fundamental Algorithms by Donald E. Knuth

Charles Babbage, discrete time, distributed generation, Donald Knuth, fear of failure, Fermat's Last Theorem, G4S, Gerard Salton, Isaac Newton, Ivan Sutherland, Jacquard loom, Johannes Kepler, John von Neumann, linear programming, linked data, Menlo Park, probability theory / Blaise Pascal / Pierre de Fermat, sorting algorithm, stochastic process, Turing machine

But a running time of O(n2) does not necessarily imply that the running time is not also O(n). There's another notation, Big Omega, for lower bounds: The statement g(n) = n(f(n)) B0) means that there are positive constants L and no such that \g(ri)\ > L\f(n)\ for all n > n0. Using this notation we can correctly conclude that a sorting algorithm whose running time is Q(n2) will not be as efficient as one whose running time is O(nlogn), if n is large enough. However, without knowing the constant factors implied by O and Q,, we cannot say anything about how large n must be before the O(n log n) method will begin to win. Finally, if we want to state an exact order of growth without being precise about constant factors, we can use Big Theta notation: g(n) = e(f(n)) <=> g(n) = O(f(n)) and g(n) = Q(f(n)).

22. [23] Program T assumes that its input tape contains valid information, but a program that is intended for general use should always make careful tests on its input so that clerical errors can be detected, and so that the program cannot "destroy itself." For example, if one of the input relations for k were negative, Program T may erroneously change one of its own instructions when storing into X [A;]. Suggest ways to modify Program T so that it is suitable for general use. 2.2.3 LINKED ALLOCATION 271 > 23. [27] When the topological sort algorithm cannot proceed because it has detected a loop in the input (see step T8), it is usually of no use to stop and say, "There was a loop." It is helpful to print out one of the loops, thereby showing part of the input that was in error. Extend Algorithm T so that it will do this additional printing of a loop when necessary.

(See also exercise 1.2.3-36.) 13. [15] What oriented tree on the vertices {1, 2,..., 10} has the canonical represen- representation 3, 1, 4, 1, 5, 9, 2, 6, 5? 14. [10] True or false: The last entry, /(Vn_i), in the canonical representation of an oriented tree is always the root of that tree. 15. [21] Discuss the relationships that exist (if any) between the topological sort algorithm of Section 2.2.3 and the canonical representation of an oriented tree. 16. [25] Design an algorithm (as efficient as possible) that converts from the canonical representation of an oriented tree to a conventional computer representation using PARENT links. > 17. [M26] Let f(x) be an integer-valued function, where 1 < f(x) < m for all integers 1 < x < m.


pages: 931 words: 79,142

Concepts, Techniques, and Models of Computer Programming by Peter Van-Roy, Seif Haridi

computer age, Debian, discrete time, Donald Knuth, Eratosthenes, fault tolerance, functional programming, G4S, general-purpose programming language, George Santayana, John von Neumann, Lao Tzu, Menlo Park, natural language processing, NP-complete, Paul Graham, premature optimization, sorting algorithm, the Cathedral and the Bazaar, Therac-25, Turing complete, Turing machine, type inference

Answer the following two questions: (a) What happens if you delete an element from an empty queue? (b) Why is it wrong to define IsEmpty as follows? fun {IsEmpty q(N S E)} S==E end 15. Quicksort. The following is a possible algorithm for sorting lists. Its inventor, C.A.R. Hoare, called it quicksort, because it was the fastest known general-purpose sorting algorithm at the time it was invented. It uses a divide and conquer strategy to give an average time complexity of O(n log n). Here is an informal description of the algorithm for the declarative model. Given an input list L, then do the following operations: (a) Pick L’s first element, X, to use as a pivot.

More romantically, it is sometimes said that the component has a “secret.” 6.7 Program design in the large 459 in practice, interface changes cannot be avoided during the design of a component. All we can do is minimize their frequency. This means that the interfaces of oftenneeded components should be designed as carefully as possible from the start. Let us give a simple example to make this clear. Consider a component that sorts lists of character strings. It can change its sorting algorithm without changing its interface. This can often be done without recompiling the rest of the program, simply by linking in the new component. On the other hand, if the character format is changed, then the component might require a different interface. For example, characters can change size from one to two bytes (if ASCII is replaced with Unicode).

A naive program might consist of axioms defining a permutation of a list and a query that states that there exists a permutation whose elements are in ascending order. Such a program would be short but inefficient. Much more efficient would be to write axioms that express the properties of an efficient sorting algorithm, such as mergesort. A major achievement of computer science is that practical logic programming systems have been built by combining these two approaches. The first popular language to achieve this goal was Prolog; it was subsequently followed by many other languages. High-performance Prolog implementations are amazingly fast; they can even rival the speed of imperative language implementations [216]. 9.3.2 Operational and logical semantics There are two ways to look at a logic program: the logical view and the operational view.


pages: 260 words: 77,007

Are You Smart Enough to Work at Google?: Trick Questions, Zen-Like Riddles, Insanely Difficult Puzzles, and Other Devious Interviewing Techniques You ... Know to Get a Job Anywhere in the New Economy by William Poundstone

affirmative action, Albert Einstein, big-box store, Buckminster Fuller, car-free, cloud computing, creative destruction, digital rights, en.wikipedia.org, full text search, hiring and firing, How many piano tuners are there in Chicago?, index card, Isaac Newton, Johannes Kepler, John von Neumann, lateral thinking, loss aversion, mental accounting, Monty Hall problem, new economy, off-the-grid, Paul Erdős, RAND corporation, random walk, Richard Feynman, rolodex, Rubik’s Cube, Silicon Valley, Silicon Valley startup, sorting algorithm, Steve Ballmer, Steve Jobs, The Spirit Level, Tony Hsieh, why are manhole covers round?, William Shockley: the traitorous eight

Tyma posed this question to his mother: See Tyma’s blog post at http://paultyma.blogspot.com/2007/03/howto-pass-silicon-valley-software.html. about twenty times faster than quicksort: With 1,000,000 records to sort, Mrs. Tyma’s method requires 1,000,000 operations. Quicksort, and other optimal sorting algorithms, require on the order of 1,000,000 log2 (1,000,000) operations. Taking this at face value, Mrs. Tyma’s method is approximately log2 (1,000,000), or 19.9 +, times faster. Chapter Six “At Google, we believe in collaboration”: Mohammad, blog post at http://allouh.wordpress.com/2009/04/14/interview-with-google/.


pages: 250 words: 73,574

Nine Algorithms That Changed the Future: The Ingenious Ideas That Drive Today's Computers by John MacCormick, Chris Bishop

Ada Lovelace, AltaVista, Charles Babbage, Claude Shannon: information theory, Computing Machinery and Intelligence, fault tolerance, information retrieval, Menlo Park, PageRank, pattern recognition, Richard Feynman, Silicon Valley, Simon Singh, sorting algorithm, speech recognition, Stephen Hawking, Steve Jobs, Steve Wozniak, traveling salesman, Turing machine, Turing test, Vannevar Bush

The first criterion—everyday use by ordinary computer users—eliminates algorithms used primarily by computer professionals, such as compilers and program verification techniques. The second criterion—concrete application to a specific problem—eliminates many of the great algorithms that are central to the undergraduate computer science curriculum. This includes sorting algorithms like quicksort, graph algorithms such as Dijkstra's shortest-path algorithm, and data structures such as hash tables. These algorithms are indisputably great and they easily meet the first criterion, since most application programs run by ordinary users employ them repeatedly. But these algorithms are generic: they can be applied to a vast array of different problems.


pages: 231 words: 71,248

Shipping Greatness by Chris Vander Mey

business logic, corporate raider, don't be evil, en.wikipedia.org, fudge factor, Google Chrome, Google Hangouts, Gordon Gekko, Jeff Bezos, Kickstarter, Lean Startup, minimum viable product, performance metric, recommendation engine, Skype, slashdot, sorting algorithm, source of truth, SQL injection, Steve Jobs, Superbowl ad, two-pizza team, web application

You may want to provide less detail than I show in this API. But if you’re building anything developer facing, even if those developers are other engineers within your business, it’s worth considering digging deep so that you don’t expose a problem later. For example, if Reviews were assuming we’d give the content a letter grade, its sort algorithm might be completely different! Step 6. Write the Functional Specifications Document You’ve crafted a product idea that solves a real need for a real group of customers. You’ve pitched and received buy-in from your essential stakeholders. You’ve worked with your dependencies to define how you’ll interface.


pages: 999 words: 194,942

Clojure Programming by Chas Emerick, Brian Carper, Christophe Grand

Amazon Web Services, Benoit Mandelbrot, cloud computing, cognitive load, continuous integration, database schema, domain-specific language, don't repeat yourself, drop ship, duck typing, en.wikipedia.org, failed state, finite state, Firefox, functional programming, game design, general-purpose programming language, Guido van Rossum, higher-order functions, Larry Wall, mandelbrot fractal, no silver bullet, Paul Graham, platform as a service, premature optimization, random walk, Ruby on Rails, Schrödinger's Cat, semantic web, software as a service, sorting algorithm, SQL injection, Turing complete, type inference, web application

[356] Note that you can absolutely opt to use “legacy” dependency injection containers (like Spring, Guice, et al.) with Clojure types and records. Strategy Pattern Another common design pattern is the strategy pattern. This pattern allows the selection of a method or algorithm dynamically. Suppose we want to select a sorting algorithm at runtime: interface ISorter { public sort (int[] numbers); } class QuickSort implements ISorter { public sort (int[] numbers) { ... } } class MergeSort implements ISorter { public sort (int[] numbers) { ... } } class Sorter { private ISorter sorter; public Sorter (ISorter sorter) { this.sorter = sorter; } public execute (int[] numbers) { sorter.sort(numbers); } } class App { public ISorter chooseSorter () { if (...) { return new QuickSort(); } else { return new MergeSort(); } } public static void main(String[] args) { int[] numbers = {5,1,4,2,3}; Sorter s = new Sorter(chooseSorter()); s.execute(numbers); //... now use sorted numbers } } Clojure has a very simple advantage over Java in this case.

(defn choose-sorter [] (if ... quicksort mergesort)) (defn main [] (let [numbers [...]] ((choose-sorter) numbers))) There are no classes in sight. Each function implementing the semantics of our algorithm can be called directly, without class definitions getting in the way of our specifying behavior. You don’t even have to give your sorting algorithm a name—an anonymous function works just as well. For example, the composition of Clojure’s built-in sort function and then reverse to reverse the sort order is an anonymous function: ((comp reverse sort) [2 1 3]) ;= (3 2 1) Chain of Responsibility While Clojure’s facilities make many patterns unnecessary or invisible, a select few remain relevant and continue to impact our design and implementation of Clojure programs.


pages: 336 words: 88,320

Being Geek: The Software Developer's Career Handbook by Michael Lopp

do what you love, finite state, game design, job satisfaction, John Gruber, knowledge worker, reality distortion field, remote working, rolodex, Silicon Valley, Silicon Valley startup, Skype, sorting algorithm, systems thinking, web application

I can't think of the last time I worried about which version of a document I was on, and that means I'm spending more time working than worrying. My Tools Are Designed to Remove Repetitive Motion One of my first algorithmic holy shits was during my second computer science class as we were learning sorting algorithms. The professor elegantly walked us through the construction of different algorithms, explaining the pros and the cons, and then he landed Quicksort. Holy shit. It wasn't just the elegance. It wasn't the recursive simplicity; it was the discovery that with imagination there were approaches that were wildly more efficient—and simpler.


Learn Algorithmic Trading by Sebastien Donadio

active measures, algorithmic trading, automated trading system, backtesting, Bayesian statistics, behavioural economics, buy and hold, buy low sell high, cryptocurrency, data science, deep learning, DevOps, en.wikipedia.org, fixed income, Flash crash, Guido van Rossum, latency arbitrage, locking in a profit, market fundamentalism, market microstructure, martingale, natural language processing, OpenAI, p-value, paper trading, performance metric, prediction markets, proprietary trading, quantitative trading / quantitative finance, random walk, risk tolerance, risk-adjusted returns, Sharpe ratio, short selling, sorting algorithm, statistical arbitrage, statistical model, stochastic process, survivorship bias, transaction costs, type inference, WebSocket, zero-sum game

We will write the following function to test the creation of the book event after the top of the book changes: def test_generate_book_event(self): order1 = { 'id': 1, 'price': 219, 'quantity': 10, 'side': 'bid', 'action': 'new' } ob_for_aapl = self.reforderbook self.assertEqual(ob_for_aapl.handle_order(order1), {'bid_price': 219, 'bid_quantity': 10, 'offer_price': -1, 'offer_quantity': -1}) order2 = order1.copy() order2['id'] = 2 order2['price'] = 220 order2['side'] = 'ask' self.assertEqual(ob_for_aapl.handle_order(order2), {'bid_price': 219, 'bid_quantity': 10, 'offer_price': 220, 'offer_quantity': 10}) if __name__ == '__main__': unittest.main() In this section, we studied how to build a limit order book. This was a naive implementation. The complexity to add an order is in the order of O(N) and for each insertion, we use a sorting algorithm with a complexity of O(N log N). In order to get a book working faster for order insertion, order lookup, we should use more advanced data structures, as described in Algorithm Analysis, Packt Publishing. Because we need to sort the order by price, we need to use an ordered data structure, such as trees.


pages: 1,082 words: 87,792

Python for Algorithmic Trading: From Idea to Cloud Deployment by Yves Hilpisch

algorithmic trading, Amazon Web Services, automated trading system, backtesting, barriers to entry, bitcoin, Brownian motion, cloud computing, coronavirus, cryptocurrency, data science, deep learning, Edward Thorp, fiat currency, global macro, Gordon Gekko, Guido van Rossum, implied volatility, information retrieval, margin call, market microstructure, Myron Scholes, natural language processing, paper trading, passive investing, popular electronics, prediction markets, quantitative trading / quantitative finance, random walk, risk free rate, risk/return, Rubik’s Cube, seminal paper, Sharpe ratio, short selling, sorting algorithm, systematic trading, transaction costs, value at risk

This is a realistic setting for financial trading algorithms for which any element of (perfect) foresight is to be excluded. By contrast, an offline algorithm knows the complete data set from the beginning. Many algorithms in computer science fall into the category of offline algorithms, such as a sorting algorithm over a list of numbers. To generate signals in real time on the basis of an online algorithm, data needs to be collected and processed over time. Consider, for example, a trading strategy based on the time series momentum of the last three five-second intervals (see Chapter 4). Tick data needs to be collected and then resampled, and the momentum needs to be calculated based on the resampled data set.


pages: 332 words: 93,672

Life After Google: The Fall of Big Data and the Rise of the Blockchain Economy by George Gilder

23andMe, Airbnb, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, Albert Einstein, AlphaGo, AltaVista, Amazon Web Services, AOL-Time Warner, Asilomar, augmented reality, Ben Horowitz, bitcoin, Bitcoin Ponzi scheme, Bletchley Park, blockchain, Bob Noyce, British Empire, Brownian motion, Burning Man, business process, butterfly effect, carbon footprint, cellular automata, Claude Shannon: information theory, Clayton Christensen, cloud computing, computer age, computer vision, crony capitalism, cross-subsidies, cryptocurrency, Danny Hillis, decentralized internet, deep learning, DeepMind, Demis Hassabis, disintermediation, distributed ledger, don't be evil, Donald Knuth, Donald Trump, double entry bookkeeping, driverless car, Elon Musk, Erik Brynjolfsson, Ethereum, ethereum blockchain, fake news, fault tolerance, fiat currency, Firefox, first square of the chessboard, first square of the chessboard / second half of the chessboard, floating exchange rates, Fractional reserve banking, game design, Geoffrey Hinton, George Gilder, Google Earth, Google Glasses, Google Hangouts, index fund, inflation targeting, informal economy, initial coin offering, Internet of things, Isaac Newton, iterative process, Jaron Lanier, Jeff Bezos, Jim Simons, Joan Didion, John Markoff, John von Neumann, Julian Assange, Kevin Kelly, Law of Accelerating Returns, machine translation, Marc Andreessen, Mark Zuckerberg, Mary Meeker, means of production, Menlo Park, Metcalfe’s law, Money creation, money: store of value / unit of account / medium of exchange, move fast and break things, Neal Stephenson, Network effects, new economy, Nick Bostrom, Norbert Wiener, Oculus Rift, OSI model, PageRank, pattern recognition, Paul Graham, peer-to-peer, Peter Thiel, Ponzi scheme, prediction markets, quantitative easing, random walk, ransomware, Ray Kurzweil, reality distortion field, Recombinant DNA, Renaissance Technologies, Robert Mercer, Robert Metcalfe, Ronald Coase, Ross Ulbricht, Ruby on Rails, Sand Hill Road, Satoshi Nakamoto, Search for Extraterrestrial Intelligence, self-driving car, sharing economy, Silicon Valley, Silicon Valley ideology, Silicon Valley startup, Singularitarianism, Skype, smart contracts, Snapchat, Snow Crash, software is eating the world, sorting algorithm, South Sea Bubble, speech recognition, Stephen Hawking, Steve Jobs, Steven Levy, Stewart Brand, stochastic process, Susan Wojcicki, TED Talk, telepresence, Tesla Model S, The Soul of a New Machine, theory of mind, Tim Cook: Apple, transaction costs, tulip mania, Turing complete, Turing machine, Vernor Vinge, Vitalik Buterin, Von Neumann architecture, Watson beat the top human players on Jeopardy!, WikiLeaks, Y Combinator, zero-sum game

At the time that would have been Sun with its SPARCstations, its RISC (reduced instruction set computer), its Java virtual machines, its Solaris OS—all competing with the rising leviathan Microsoft and the still ascendant IBM. No, Schmidt wrote in his midnight email, the profits would flow to “the companies with the best networks and the best search and sort algorithms.”7 This insight I dubbed Schmidt’s Law. Schmidt was not just a midnight email doodler. He soon left Sun and, after a stint as CEO of Novell trying to build the best networks and search engines in Utah, he joined Google and soon became CEO. There he found himself engulfed by the future he had predicted.


Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems by Martin Kleppmann

active measures, Amazon Web Services, billion-dollar mistake, bitcoin, blockchain, business intelligence, business logic, business process, c2.com, cloud computing, collaborative editing, commoditize, conceptual framework, cryptocurrency, data science, database schema, deep learning, DevOps, distributed ledger, Donald Knuth, Edward Snowden, end-to-end encryption, Ethereum, ethereum blockchain, exponential backoff, fake news, fault tolerance, finite state, Flash crash, Free Software Foundation, full text search, functional programming, general-purpose programming language, Hacker News, informal economy, information retrieval, Internet of things, iterative process, John von Neumann, Ken Thompson, Kubernetes, Large Hadron Collider, level 1 cache, loose coupling, machine readable, machine translation, Marc Andreessen, microservices, natural language processing, Network effects, no silver bullet, operational security, packet switching, peer-to-peer, performance metric, place-making, premature optimization, recommendation engine, Richard Feynman, self-driving car, semantic web, Shoshana Zuboff, social graph, social web, software as a service, software is eating the world, sorting algorithm, source of truth, SPARQL, speech recognition, SQL injection, statistical model, surveillance capitalism, systematic bias, systems thinking, Tragedy of the Commons, undersea cable, web application, WebSocket, wikimedia commons

Knowledge, Truth, and Lies | 307 For modeling real systems, the partially synchronous model with crash-recovery faults is generally the most useful model. But how do distributed algorithms cope with that model? Correctness of an algorithm To define what it means for an algorithm to be correct, we can describe its properties. For example, the output of a sorting algorithm has the property that for any two dis‐ tinct elements of the output list, the element further to the left is smaller than the ele‐ ment further to the right. That is simply a formal way of defining what it means for a list to be sorted. Similarly, we can write down the properties we want of a distributed algorithm to define what it means to be correct.

To ensure that all key-value pairs with the same key end up at the same reducer, the framework uses a hash of the key to determine which reduce task should receive a particular key-value pair (see “Partitioning by Hash of Key” on page 203). The key-value pairs must be sorted, but the dataset is likely too large to be sorted with a conventional sorting algorithm on a single machine. Instead, the sorting is per‐ formed in stages. First, each map task partitions its output by reducer, based on the hash of the key. Each of these partitions is written to a sorted file on the mapper’s local disk, using a technique similar to what we discussed in “SSTables and LSMTrees” on page 76.


pages: 1,237 words: 227,370

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems by Martin Kleppmann

active measures, Amazon Web Services, billion-dollar mistake, bitcoin, blockchain, business intelligence, business logic, business process, c2.com, cloud computing, collaborative editing, commoditize, conceptual framework, cryptocurrency, data science, database schema, deep learning, DevOps, distributed ledger, Donald Knuth, Edward Snowden, end-to-end encryption, Ethereum, ethereum blockchain, exponential backoff, fake news, fault tolerance, finite state, Flash crash, Free Software Foundation, full text search, functional programming, general-purpose programming language, Hacker News, informal economy, information retrieval, Infrastructure as a Service, Internet of things, iterative process, John von Neumann, Ken Thompson, Kubernetes, Large Hadron Collider, level 1 cache, loose coupling, machine readable, machine translation, Marc Andreessen, microservices, natural language processing, Network effects, no silver bullet, operational security, packet switching, peer-to-peer, performance metric, place-making, premature optimization, recommendation engine, Richard Feynman, self-driving car, semantic web, Shoshana Zuboff, social graph, social web, software as a service, software is eating the world, sorting algorithm, source of truth, SPARQL, speech recognition, SQL injection, statistical model, surveillance capitalism, systematic bias, systems thinking, Tragedy of the Commons, undersea cable, web application, WebSocket, wikimedia commons

For modeling real systems, the partially synchronous model with crash-recovery faults is generally the most useful model. But how do distributed algorithms cope with that model? Correctness of an algorithm To define what it means for an algorithm to be correct, we can describe its properties. For example, the output of a sorting algorithm has the property that for any two distinct elements of the output list, the element further to the left is smaller than the element further to the right. That is simply a formal way of defining what it means for a list to be sorted. Similarly, we can write down the properties we want of a distributed algorithm to define what it means to be correct.

To ensure that all key-value pairs with the same key end up at the same reducer, the framework uses a hash of the key to determine which reduce task should receive a particular key-value pair (see “Partitioning by Hash of Key”). The key-value pairs must be sorted, but the dataset is likely too large to be sorted with a conventional sorting algorithm on a single machine. Instead, the sorting is performed in stages. First, each map task partitions its output by reducer, based on the hash of the key. Each of these partitions is written to a sorted file on the mapper’s local disk, using a technique similar to what we discussed in “SSTables and LSM-Trees”.


Refactoring: Improving the Design of Existing Code by Martin Fowler, Kent Beck, John Brant, William Opdyke, Don Roberts

business logic, conceptual framework, database schema, index card, MVC pattern, no silver bullet, place-making, sorting algorithm

If you have to redundantly move code to accomplish this, you may be able to use Extract Superclass to atone. Incomplete Library Class Reuse is often touted as the purpose of objects. We think reuse is overrated (we just use). However, we can't deny that much of our programming skill is based on library classes so that nobody can tell whether we've forgotten our sort algorithms. Builders of library classes are rarely omniscient. We don't blame them for that; after all, we can rarely figure out a design until we've mostly built it, so library builders have a really tough job. The trouble is that it is often bad form, and usually impossible, to modify a library class to do something you'd like it to do.


pages: 359 words: 96,019

How to Turn Down a Billion Dollars: The Snapchat Story by Billy Gallagher

Airbnb, Albert Einstein, Amazon Web Services, AOL-Time Warner, Apple's 1984 Super Bowl advert, augmented reality, Bernie Sanders, Big Tech, Black Swan, citizen journalism, Clayton Christensen, computer vision, data science, disruptive innovation, Donald Trump, El Camino Real, Elon Musk, fail fast, Fairchild Semiconductor, Frank Gehry, gamification, gentrification, Google Glasses, Hyperloop, information asymmetry, Jeff Bezos, Justin.tv, Kevin Roose, Lean Startup, Long Term Capital Management, Mark Zuckerberg, Menlo Park, minimum viable product, Nelson Mandela, Oculus Rift, paypal mafia, Peter Thiel, power law, QR code, Robinhood: mobile stock trading app, Salesforce, Sand Hill Road, Saturday Night Live, Sheryl Sandberg, side project, Silicon Valley, Silicon Valley startup, skeuomorphism, Snapchat, social graph, SoftBank, sorting algorithm, speech recognition, stealth mode startup, Steve Jobs, TechCrunch disrupt, too big to fail, value engineering, Y Combinator, young professional

Evan, having studied journalism at Crossroads, was obsessed with Stories being able to show a narrative, as users could really tell rich stories mixing photo and video rather than just posting a couple photos from their night out. Content would appear for twenty-four hours then disappear. If a friend’s story was dull or too long and dragged on, users could just tap it to skip ahead. With no likes and comments, there was no need for a sorting algorithm, like those found on Facebook, Instagram, and Twitter. Content would just appear based on who had posted most recently—an added incentive to post more often. Without likes and comments, Snapchat’s new feature needed a different way to reward users for posting and capture its users’ psychological craving for attention.


Beautiful Visualization by Julie Steele

barriers to entry, correlation does not imply causation, data acquisition, data science, database schema, Drosophila, en.wikipedia.org, epigenetics, global pandemic, Hans Rosling, index card, information retrieval, iterative process, linked data, Mercator projection, meta-analysis, natural language processing, Netflix Prize, no-fly zone, pattern recognition, peer-to-peer, performance metric, power law, QR code, recommendation engine, semantic web, social bookmarking, social distancing, social graph, sorting algorithm, Steve Jobs, the long tail, web application, wikimedia commons, Yochai Benkler

In no study was animation found to outperform rich static diagrams. It did beat out textual representations, though, and simple representations that simply showed start and end state without transitions. Algorithm animation is in many ways similar to process visualization: an algorithm can be illustrated by showing the steps that it takes. Some sort algorithms, for example, are very amenable to animation: an array of values can be drawn as a sequence of bars, so the sort operations move bars around. These animations can easily show the differences between, say, a bubble sort and an insertion sort. Christopher Hundhausen, Sarah Douglas, and John Stasko (2002) tried to understand the effectiveness of algorithm visualization in the classroom, but half of the controlled studies they examined found that animation did not help students understand algorithms.


pages: 428 words: 103,544

The Data Detective: Ten Easy Rules to Make Sense of Statistics by Tim Harford

Abraham Wald, access to a mobile phone, Ada Lovelace, affirmative action, algorithmic bias, Automated Insights, banking crisis, basic income, behavioural economics, Black Lives Matter, Black Swan, Bretton Woods, British Empire, business cycle, Cambridge Analytica, Capital in the Twenty-First Century by Thomas Piketty, Cass Sunstein, Charles Babbage, clean water, collapse of Lehman Brothers, contact tracing, coronavirus, correlation does not imply causation, COVID-19, cuban missile crisis, Daniel Kahneman / Amos Tversky, data science, David Attenborough, Diane Coyle, disinformation, Donald Trump, Estimating the Reproducibility of Psychological Science, experimental subject, fake news, financial innovation, Florence Nightingale: pie chart, Gini coefficient, Great Leap Forward, Hans Rosling, high-speed rail, income inequality, Isaac Newton, Jeremy Corbyn, job automation, Kickstarter, life extension, meta-analysis, microcredit, Milgram experiment, moral panic, Netflix Prize, Northpointe / Correctional Offender Management Profiling for Alternative Sanctions, opioid epidemic / opioid crisis, Paul Samuelson, Phillips curve, publication bias, publish or perish, random walk, randomized controlled trial, recommendation engine, replication crisis, Richard Feynman, Richard Thaler, rolodex, Ronald Reagan, selection bias, sentiment analysis, Silicon Valley, sorting algorithm, sparse data, statistical model, stem cell, Stephen Hawking, Steve Bannon, Steven Pinker, survivorship bias, systematic bias, TED Talk, universal basic income, W. E. B. Du Bois, When a measure becomes a target

See health and medical data public opinion, 149, 220 public transportation, 47–49 publication bias, 113–16, 118–23, 125–27 publicity, 107 Puerto Rico, 197–98, 200 Puy de Dôme, France, 172 Quetelet, Adolphe, 219 racial data, 176–79, 206 Random Walk down Wall Street, A (Malkiel), 125 randomized clinical trials, 4n, 53, 125–26, 133, 180 randomness, 123–24 Rapid Safety Feedback, 170–71 Rayner, Derek, 205–8 Reaper Man (Pratchett), 87 recessions, 11 recommendation engines, 181 record-keeping practices, 220–21 refugees, 191 Reifler, Jason, 129 Reischauer, Robert, 187 Reiter, Jonathan, 108 reliability of data, 233–37 religious authority, 16 religious beliefs, 247–48 Remington Rand, 244 replication/reproducibility studies and problems, 107, 112–16, 120–22, 129–31 Republican Party, 34, 189n, 269, 270 résumé-sorting algorithms, 166 ridership data, 49–51 Riecken, Henry, 239 risk models, 71 Rivlin, Alice, 186–87, 188, 212 Robinson, Nicholas, 168, 169–70 Roman Catholic Church, 16 Rönnlund, Anna Rosling, 62, 63 Roosevelt, Franklin Delano, 143–44 rose diagrams, 215–16, 233–36, 234 Roser, Max, 89, 96 Rosling, Hans, 63, 185 Ross, Lee, 35 Royal Naval Reserve, 218 Royal Society, 13 Royal Statistical Society, 194, 214, 219, 233 Rozenblit, Leonid, 272 Ruge, Mari, 89 sampling techniques, 135–38, 142–51, 155 Samuelson, Paul, 239 sanitation advocacy, 225–26, 233–37 Santos, Alexander, 198 Say It with Charts (Zelazny), 228 scale of statistical data, 92, 93–95, 103 Scarr, Simon, 231–32 Schachter, Stanley, 239 Scheibehenne, Benjamin, 106, 111, 114, 120–21 Scientific American, 102 scientific curiosity, 268–69 scientific literacy, 34–35 scientific method, 173 Scott, James C., 201, 203 Scott Brown, Denise, 217 screen-use studies, 117–18 Scutari (Üsküdar, Istanbul) barracks hospital, 213–14, 220, 225, 233, 235 search algorithms, 156–57 Second World War, 4, 262 secrecy, 174–75 Seehofer, Horst, 191 Seeing Like a State (Scott), 201, 203 selection bias, 2, 245–46.


Reset by Ronald J. Deibert

23andMe, active measures, air gap, Airbnb, Amazon Web Services, Anthropocene, augmented reality, availability heuristic, behavioural economics, Bellingcat, Big Tech, bitcoin, blockchain, blood diamond, Brexit referendum, Buckminster Fuller, business intelligence, Cal Newport, call centre, Cambridge Analytica, carbon footprint, cashless society, Citizen Lab, clean water, cloud computing, computer vision, confounding variable, contact tracing, contact tracing app, content marketing, coronavirus, corporate social responsibility, COVID-19, crowdsourcing, data acquisition, data is the new oil, decarbonisation, deep learning, deepfake, Deng Xiaoping, disinformation, Donald Trump, Doomsday Clock, dual-use technology, Edward Snowden, Elon Musk, en.wikipedia.org, end-to-end encryption, Evgeny Morozov, failed state, fake news, Future Shock, game design, gig economy, global pandemic, global supply chain, global village, Google Hangouts, Great Leap Forward, high-speed rail, income inequality, information retrieval, information security, Internet of things, Jaron Lanier, Jeff Bezos, John Markoff, Lewis Mumford, liberal capitalism, license plate recognition, lockdown, longitudinal study, Mark Zuckerberg, Marshall McLuhan, mass immigration, megastructure, meta-analysis, military-industrial complex, move fast and break things, Naomi Klein, natural language processing, New Journalism, NSO Group, off-the-grid, Peter Thiel, planetary scale, planned obsolescence, post-truth, proprietary trading, QAnon, ransomware, Robert Mercer, Sheryl Sandberg, Shoshana Zuboff, Silicon Valley, single source of truth, Skype, Snapchat, social distancing, sorting algorithm, source of truth, sovereign wealth fund, sparse data, speech recognition, Steve Bannon, Steve Jobs, Stuxnet, surveillance capitalism, techlash, technological solutionism, the long tail, the medium is the message, The Structural Transformation of the Public Sphere, TikTok, TSMC, undersea cable, unit 8200, Vannevar Bush, WikiLeaks, zero day, zero-sum game

But caked on top of them is a bewildering array of new applications, features, and devices.14 Weaving through it all are rivers of data, some neatly contained in proper channels, others pouring through the cracks and crevices and spilling out in the form of data breaches. The phrase “data is the new oil” refers to the value to be gained from all the data that is routinely harvested by machines from both humans and other machines — the entire complex bristling with millions of pulsating data-sorting algorithms and sensors. The gradual rollout of fifth-generation cellular technology, known as 5G, will dramatically increase the speed and broadband capacity of cellular networks, fuelling an even greater volume of data circulating among a larger number of networked devices. The combined effect of each of us turning the most intimate aspects of our digital lives inside out has created a new emergent property on a planetary scale that has taken a life of its own — derived from but separate from us, a datasphere.


pages: 423 words: 21,637

On Lisp: Advanced Techniques for Common Lisp by Paul Graham

Donald Knuth, functional programming, G4S, L Peter Deutsch, Paul Graham, sorting algorithm, Turing machine

lst)) (A B) This is a rather trivial example, but bigger programs can be constructed on the same principle. Whenever we want to program by combining partial solutions, Prolog may be useful. Indeed, a surprising variety of problems can be expressed in such terms: Figure 24.14, for example, shows a sorting algorithm expressed as a collection of constraints on the solution. 24.4 The Need for Nondeterminism Chapter 22 explained the relation between deterministic and nondeterministic search. A deterministic search program could take a query and generate all the solutions which satisfied it. A nondeterministic search program will use choose to generate solutions one at a time, and if more are needed, will call fail to restart the search.


pages: 404 words: 43,442

The Art of R Programming by Norman Matloff

data science, Debian, discrete time, Donald Knuth, functional programming, general-purpose programming language, linked data, sorting algorithm, statistical model

Both of the example applications we’ve looked at here would be considered embarrassingly parallel. Parallelizing the for i loop for the mutual outlinks problem in Section 16.1 was pretty obvious. Partitioning the work in the KMC example in Section 16.2.4 was also natural and easy. By contrast, most parallel sorting algorithms require a great deal of interaction. For instance, consider merge sort, a common method of sorting numbers. It breaks the vector to be sorted into two (or more) independent parts, say the left half and right half, which are then sorted in parallel by two processes. So far, this is embarrassingly parallel, at least after the vector is divided in half.


pages: 429 words: 114,726

The Computer Boys Take Over: Computers, Programmers, and the Politics of Technical Expertise by Nathan L. Ensmenger

barriers to entry, business process, Charles Babbage, Claude Shannon: information theory, computer age, deskilling, Donald Knuth, Firefox, Frederick Winslow Taylor, functional programming, future of work, Grace Hopper, informal economy, information retrieval, interchangeable parts, Isaac Newton, Jacquard loom, job satisfaction, John von Neumann, knowledge worker, Larry Ellison, loose coupling, machine readable, new economy, no silver bullet, Norbert Wiener, pattern recognition, performance metric, Philip Mirowski, post-industrial society, Productivity paradox, RAND corporation, Robert Gordon, scientific management, Shoshana Zuboff, sorting algorithm, Steve Jobs, Steven Levy, systems thinking, tacit knowledge, technological determinism, the market place, The Theory of the Leisure Class by Thorstein Veblen, Thomas Kuhn: the structure of scientific revolutions, Thorstein Veblen, Turing machine, Von Neumann architecture, world market for maybe five computers, Y2K

It exists simultaneously as an idea, language, technology, and practice. Although intimately associated with the computer, it also clearly transcends it. For the most part software is invisible, ethereal, and ephemeral—and yet it is also obviously constructed. Certain aspects of software, such as a sorting algorithm, can be generalized and formalized as mathematical abstractions, while others remain inescapably local and specific, subject to the particular constraints imposed by corporate culture, informal industry standards, or government regulations. In this sense, software sits uncomfortably at the intersection of science, engineering, and business.


pages: 381 words: 120,361

Sunfall by Jim Al-Khalili

airport security, artificial general intelligence, augmented reality, Bletchley Park, Carrington event, cosmological constant, cryptocurrency, dark matter, David Attenborough, Fellow of the Royal Society, Higgs boson, imposter syndrome, Intergovernmental Panel on Climate Change (IPCC), Internet of things, invisible hand, Kickstarter, lockdown, mass immigration, megacity, MITM: man-in-the-middle, off grid, pattern recognition, quantum cryptography, quantum entanglement, Silicon Valley, smart cities, sorting algorithm, South China Sea, stem cell, Stephen Hawking, Turing test

Viewpoints ranging from enraged libertarians and conspiracy theorists disputing that anything had happened at all, to the even more vociferous end-of-the-world fanatics convinced this was finally the sign they had been waiting for, all competed for attention. The virtual-assistant system installed in the house, even running its highly sophisticated sorting algorithm, was proving unable to weed out the spam from the ham to build any reliable picture. Marc sighed. As usual, you had to do a little digging to get to the truth. ‘Select favourites only. Past twenty-four hours. Keywords: magnetic storm, solar flare, threat level.’ One thing he hadn’t done yet was change the VA’s settings, so it still spoke in the voice of the old British natural-history broadcaster Sir David Attenborough, a favourite of his mother’s: The top hit discussion is whether the current event was due directly to the weakening of the Earth’s magnetosphere – consensus rating 95.2 per cent – and how soon the Flip will happen and restore the planet’s protective magnetic shield – consensus on when this will occur is in the range of six months to five years from now.


Designing Interfaces by Jenifer Tidwell

A Pattern Language, business intelligence, cognitive load, crowdsourcing, Firefox, longitudinal study, school vouchers, seminal paper, social software, social web, sorting algorithm, the long tail, Tony Hsieh, web application

You should use up or down arrows to show whether the sort is in ascending or descending order. (And the presence of an arrow shows which column was last sorted on—a fortuitous side effect!) Consider using rollover effects, such as highlighting or cursor changes, on the headers to reinforce the impression of clickability. Use a stable sort algorithm. This means that if a user sorts first by name and then by date, the resultant list will show ordered groups of same-date items that are each sorted by name within the group. In other words, the current sort order will be retained in the next sort to the extent possible—subtle, but very useful.


How I Became a Quant: Insights From 25 of Wall Street's Elite by Richard R. Lindsey, Barry Schachter

Albert Einstein, algorithmic trading, Andrew Wiles, Antoine Gombaud: Chevalier de Méré, asset allocation, asset-backed security, backtesting, bank run, banking crisis, Bear Stearns, Black-Scholes formula, Bob Litterman, Bonfire of the Vanities, book value, Bretton Woods, Brownian motion, business cycle, business process, butter production in bangladesh, buy and hold, buy low sell high, capital asset pricing model, centre right, collateralized debt obligation, commoditize, computerized markets, corporate governance, correlation coefficient, creative destruction, Credit Default Swap, credit default swaps / collateralized debt obligations, currency manipulation / currency intervention, currency risk, discounted cash flows, disintermediation, diversification, Donald Knuth, Edward Thorp, Emanuel Derman, en.wikipedia.org, Eugene Fama: efficient market hypothesis, financial engineering, financial innovation, fixed income, full employment, George Akerlof, global macro, Gordon Gekko, hiring and firing, implied volatility, index fund, interest rate derivative, interest rate swap, Ivan Sutherland, John Bogle, John von Neumann, junk bonds, linear programming, Loma Prieta earthquake, Long Term Capital Management, machine readable, margin call, market friction, market microstructure, martingale, merger arbitrage, Michael Milken, Myron Scholes, Nick Leeson, P = NP, pattern recognition, Paul Samuelson, pensions crisis, performance metric, prediction markets, profit maximization, proprietary trading, purchasing power parity, quantitative trading / quantitative finance, QWERTY keyboard, RAND corporation, random walk, Ray Kurzweil, Reminiscences of a Stock Operator, Richard Feynman, Richard Stallman, risk free rate, risk-adjusted returns, risk/return, seminal paper, shareholder value, Sharpe ratio, short selling, Silicon Valley, six sigma, sorting algorithm, statistical arbitrage, statistical model, stem cell, Steven Levy, stochastic process, subscription business, systematic trading, technology bubble, The Great Moderation, the scientific method, too big to fail, trade route, transaction costs, transfer pricing, value at risk, volatility smile, Wiener process, yield curve, young professional

So I was given a chance to work on projects as the group expanded its role into quantitative analysis of banking issues. At first, the areas we were allowed to explore were more industrial than financial, such as a Monte Carlo simulation of check processing that was used to optimize a truck delivery routes from the branches to head office and check-sorting algorithms. Then we began to get involved with more financial applications—simulations of how the Bank’s earnings would be impacted by different economic environments, which led to the measurement of risk and return tradeoffs for different asset-liability management strategies. By 1977, I had moved out of the operations research division to head efforts to build models and systems for a new asset-liability Committee and then to start a modeling group to support decision making on the firm’s trading floor.


pages: 527 words: 147,690

Terms of Service: Social Media and the Price of Constant Connection by Jacob Silverman

"World Economic Forum" Davos, 23andMe, 4chan, A Declaration of the Independence of Cyberspace, Aaron Swartz, Airbnb, airport security, Amazon Mechanical Turk, augmented reality, basic income, Big Tech, Brian Krebs, California gold rush, Californian Ideology, call centre, cloud computing, cognitive dissonance, commoditize, company town, context collapse, correlation does not imply causation, Credit Default Swap, crowdsourcing, data science, deep learning, digital capitalism, disinformation, don't be evil, driverless car, drone strike, Edward Snowden, Evgeny Morozov, fake it until you make it, feminist movement, Filter Bubble, Firefox, Flash crash, game design, global village, Google Chrome, Google Glasses, Higgs boson, hive mind, Ian Bogost, income inequality, independent contractor, informal economy, information retrieval, Internet of things, Jacob Silverman, Jaron Lanier, jimmy wales, John Perry Barlow, Kevin Kelly, Kevin Roose, Kickstarter, knowledge economy, knowledge worker, Larry Ellison, late capitalism, Laura Poitras, license plate recognition, life extension, lifelogging, lock screen, Lyft, machine readable, Mark Zuckerberg, Mars Rover, Marshall McLuhan, mass incarceration, meta-analysis, Minecraft, move fast and break things, national security letter, Network effects, new economy, Nicholas Carr, Occupy movement, off-the-grid, optical character recognition, payday loans, Peter Thiel, planned obsolescence, postindustrial economy, prediction markets, pre–internet, price discrimination, price stability, profit motive, quantitative hedge fund, race to the bottom, Ray Kurzweil, real-name policy, recommendation engine, rent control, rent stabilization, RFID, ride hailing / ride sharing, Salesforce, self-driving car, sentiment analysis, shareholder value, sharing economy, Sheryl Sandberg, Silicon Valley, Silicon Valley ideology, Snapchat, social bookmarking, social graph, social intelligence, social web, sorting algorithm, Steve Ballmer, Steve Jobs, Steven Levy, systems thinking, TaskRabbit, technological determinism, technological solutionism, technoutopianism, TED Talk, telemarketer, transportation-network company, Travis Kalanick, Turing test, Uber and Lyft, Uber for X, uber lyft, universal basic income, unpaid internship, women in the workforce, Y Combinator, yottabyte, you are the product, Zipcar

Perhaps, they’ll decide to start directing me past restaurants that advertise with them and locations featuring billboards with their pay-per-gaze technology. As I pass these restaurants, I might receive ads or coupons in my Gmail in-box offering me a discount. Along the way, my entire urban experience potentially comes under the influence of Google. The maps example also raises some of the same problems we run into when thinking about sorting algorithms. Knowing that each action can influence various overseeing algorithms, do we adjust our behavior accordingly? Do we do things just so that the algorithms monitoring us won’t go off track, so that it’ll still “like” the things we like? You might already do a form of this—say, give a thumbs-up to a song on Pandora because you want to hear more of that type (assuming you trust Pandora’s system to usefully recognize a particular song type).


pages: 496 words: 70,263

Erlang Programming by Francesco Cesarini

cloud computing, fault tolerance, finite state, functional programming, higher-order functions, loose coupling, revision control, RFC: Request For Comment, social bookmarking, sorting algorithm, Turing test, type inference, web application

Hint: you will have to use a help function and concatenate the lists in several steps. Write a function that, given a list of nested lists, will return a flat list. Example: flatten([[1,[2,[3],[]]], [[[4]]], [5,6]]) ⇒ [1,2,3,4,5,6]. Hint: use concatenate to solve flatten. 84 | Chapter 3: Sequential Erlang Exercise 3-6: Sorting Lists Implement the following sort algorithms over lists: Quicksort The head of the list is taken as the pivot; the list is then split according to those elements smaller than the pivot and the rest. These two lists are then recursively sorted by quicksort, and joined together, with the pivot between them. Merge sort The list is split into two lists of (almost) equal length.


pages: 647 words: 43,757

Types and Programming Languages by Benjamin C. Pierce

Albert Einstein, combinatorial explosion, experimental subject, finite state, functional programming, Henri Poincaré, higher-order functions, Perl 6, power law, Russell's paradox, sorting algorithm, Turing complete, Turing machine, type inference, Y Combinator

(X→X→Bool) → List X → List X To test that sort is working correctly, we construct an out-of-order list, l = cons [Nat] 9 (cons [Nat] 2 (cons [Nat] 6 (cons [Nat] 4 (nil [Nat])))); sort it, l = sort [Nat] leqnat l; and read out the contents: nth = λX. λdefault:X. fix (λf:(List X)→Nat→X. λl:List X. λn:Nat. if iszero n then head [X] default l else f (tail [X] l) (pred n)); ▸ nth : "X. X → List X → Nat → X nth [Nat] 0 l 0; ▸ 2 : Nat nth [Nat] 0 l 1; ▸ 4 : Nat nth [Nat] 0 l 2; ▸ 6 : Nat nth [Nat] 0 l 3; ▸ 9 : Nat nth [Nat] 0 l 4; ▸ 0 : Nat The demonstration that a well-typed sorting algorithm could be implemented in System F was a tour de force by Reynolds (1985). His algorithm was a little different from the one presented here. 23.5.1 Solution The structure of the proof is almost exactly the same as for 9.3.9 (see page 107). For the type application rule E-TAPPTABS, we need one additional substitution lemma, paralleling Lemma 9.3.8 (see page 106).


Four Battlegrounds by Paul Scharre

2021 United States Capitol attack, 3D printing, active measures, activist lawyer, AI winter, AlphaGo, amateurs talk tactics, professionals talk logistics, artificial general intelligence, ASML, augmented reality, Automated Insights, autonomous vehicles, barriers to entry, Berlin Wall, Big Tech, bitcoin, Black Lives Matter, Boeing 737 MAX, Boris Johnson, Brexit referendum, business continuity plan, business process, carbon footprint, chief data officer, Citizen Lab, clean water, cloud computing, commoditize, computer vision, coronavirus, COVID-19, crisis actor, crowdsourcing, DALL-E, data is not the new oil, data is the new oil, data science, deep learning, deepfake, DeepMind, Demis Hassabis, Deng Xiaoping, digital map, digital rights, disinformation, Donald Trump, drone strike, dual-use technology, Elon Musk, en.wikipedia.org, endowment effect, fake news, Francis Fukuyama: the end of history, future of journalism, future of work, game design, general purpose technology, Geoffrey Hinton, geopolitical risk, George Floyd, global supply chain, GPT-3, Great Leap Forward, hive mind, hustle culture, ImageNet competition, immigration reform, income per capita, interchangeable parts, Internet Archive, Internet of things, iterative process, Jeff Bezos, job automation, Kevin Kelly, Kevin Roose, large language model, lockdown, Mark Zuckerberg, military-industrial complex, move fast and break things, Nate Silver, natural language processing, new economy, Nick Bostrom, one-China policy, Open Library, OpenAI, PalmPilot, Parler "social media", pattern recognition, phenotype, post-truth, purchasing power parity, QAnon, QR code, race to the bottom, RAND corporation, recommendation engine, reshoring, ride hailing / ride sharing, robotic process automation, Rodney Brooks, Rubik’s Cube, self-driving car, Shoshana Zuboff, side project, Silicon Valley, slashdot, smart cities, smart meter, Snapchat, social software, sorting algorithm, South China Sea, sparse data, speech recognition, Steve Bannon, Steven Levy, Stuxnet, supply-chain attack, surveillance capitalism, systems thinking, tech worker, techlash, telemarketer, The Brussels Effect, The Signal and the Noise by Nate Silver, TikTok, trade route, TSMC

In 2015, Amazon created a résumé-sorting model that developed a bias against women applicants. The model had been trained to identify traits of desirable employees based on prior résumés. Amazon’s workforce, as of 2017, was 60 percent male overall. (In many major tech firms, the workforce is even more predominantly male in technical roles.) The résumé-sorting algorithm reportedly learned to penalize résumés that included the word “women’s,” such as if an applicant stated she had played on a “women’s basketball team.” The algorithm also reportedly downgraded graduates of two all-women’s colleges. Amazon ultimately scrapped the algorithm, which the company said “was never used by Amazon recruiters to evaluate candidates.”


pages: 754 words: 48,930

Programming in Scala by Martin Odersky, Lex Spoon, Bill Venners

domain-specific language, functional programming, Guido van Rossum, higher-order functions, Larry Wall, off-by-one error, Silicon Valley, sorting algorithm, the Cathedral and the Bazaar, type inference, web application

One simple way to do so is insertion sort, which works as follows: To sort a non-empty list x :: xs, sort the Cover · Overview · Contents · Discuss · Suggest · Glossary · Index 329 Section 16.5 Chapter 16 · Working with Lists Table 16.1 · Basic list operations What it is empty.isEmpty fruit.isEmpty fruit.head fruit.tail.head diag3.head What it does returns true returns false returns "apples" returns "oranges" returns List(1, 0, 0) remainder xs and insert the first element x at the right position in the result. Sorting an empty list yields the empty list. Expressed as Scala code, the insertion sort algorithm looks like: def isort(xs: List[Int]): List[Int] = if (xs.isEmpty) Nil else insert(xs.head, isort(xs.tail)) def insert(x: Int, xs: List[Int]): List[Int] = if (xs.isEmpty || x <= xs.head) x :: xs else xs.head :: insert(x, xs.tail) 16.5 List patterns Lists can also be taken apart using pattern matching.


pages: 968 words: 224,513

The Art of Assembly Language by Randall Hyde

Donald Knuth, Free Software Foundation, machine readable, P = NP, p-value, sorting algorithm, Von Neumann architecture, Y2K

Indeed, although it is possible to modify the code above so that, on the average, it runs about twice as fast, such optimizations are wasted on such a poor algorithm. However, the bubble sort is very easy to implement and understand (which is why introductory texts continue to use it in examples). * * * [63] Fear not, you'll see some better sorting algorithms in Chapter 5. 4.22 Multidimensional Arrays The 80x86 hardware can easily handle single-dimensional arrays. Unfortunately, there is no magic addressing mode that lets you easily access elements of multidimensional arrays. That's going to take some work and several instructions. Before discussing how to declare or access multidimensional arrays, it would be a good idea to figure out how to implement them in memory.


pages: 643 words: 53,639

Rapid GUI Programming With Python and Qt by Mark Summerfield

Debian, duck typing, Guido van Rossum, loose coupling, MVC pattern, software patent, sorting algorithm, web application

If the movie’s title or year has changed, the movie may now be in the wrong position in the __movies list. In these cases, we find the movie using its original title and year, set the new title and year, and then re-sort the list. This is not as expensive in practice as it may at first appear. The list will contain, at most, one incorrectly sorted item, and Python’s sort algorithm is highly optimized for partially sorted data. If we ever found that we had a performance problem here, we could always reimplement updateMovie() using delete() and add() instead. @staticmethod def formats(): return "*.mqb *.mpb *.mqt *.mpt" Normally, we would provide one, or at most two, custom data formats for an application, but for the purposes of illustration we provide three formats using four extensions.


pages: 496 words: 174,084

Masterminds of Programming: Conversations With the Creators of Major Programming Languages by Federico Biancuzzi, Shane Warden

Benevolent Dictator For Life (BDFL), business intelligence, business logic, business process, cellular automata, cloud computing, cognitive load, commoditize, complexity theory, conceptual framework, continuous integration, data acquisition, Dennis Ritchie, domain-specific language, Douglas Hofstadter, Fellow of the Royal Society, finite state, Firefox, follow your passion, Frank Gehry, functional programming, general-purpose programming language, Guido van Rossum, higher-order functions, history of Unix, HyperCard, industrial research laboratory, information retrieval, information security, iterative process, Ivan Sutherland, John von Neumann, Ken Thompson, Larry Ellison, Larry Wall, linear programming, loose coupling, machine readable, machine translation, Mars Rover, millennium bug, Multics, NP-complete, Paul Graham, performance metric, Perl 6, QWERTY keyboard, RAND corporation, randomized controlled trial, Renaissance Technologies, Ruby on Rails, Sapir-Whorf hypothesis, seminal paper, Silicon Valley, slashdot, software as a service, software patent, sorting algorithm, SQL injection, Steve Jobs, traveling salesman, Turing complete, type inference, Valgrind, Von Neumann architecture, web application

If you’re writing a general purpose sorting routine, then you have to call a subroutine or do something like that to determine whether item A is less than item B, whatever that is. If you’re trying to sort keys to records or something, then you have to know the ordering of whatever it is you’re sorting. They may be different kinds of things. Sometimes they may be character strings, but think of what the possibilities are. When you write the sorting algorithm, you don’t know any of that stuff, which means it has to be put in later. If it’s done at runtime, of course, then it’s runtime interpretation. That can be done efficiently, don’t get me wrong, because you can have a little program, a subroutine, and then all the person has to do is, in the subroutine, to write the rules for ordering the elements that he’s sorting.


pages: 1,065 words: 229,099

Real World Haskell by Bryan O'Sullivan, John Goerzen, Donald Stewart, Donald Bruce Stewart

bash_history, database schema, Debian, distributed revision control, domain-specific language, duck typing, en.wikipedia.org, Firefox, functional programming, general-purpose programming language, Guido van Rossum, higher-order functions, job automation, Larry Wall, lateral thinking, level 1 cache, machine readable, p-value, panic early, plutocrats, revision control, sorting algorithm, SQL injection, transfer pricing, type inference, web application, Yochai Benkler

Now, we’d like to check that this function obeys the basic rules a good sort should follow. One useful invariant to start with, and one that comes up in a lot of purely functional code, is idempotency—applying a function twice has the same result as applying it only once. For our sort routine—a stable sort algorithm—this should certainly be true, or things have gone horribly wrong! This invariant can be encoded as a property simply, as follows: -- file: ch11/QC-basics.hs prop_idempotent xs = qsort (qsort xs) == qsort xs We’ll use the QuickCheck convention of prefixing test properties with prop_ in order to distinguish them from normal code.


The Art of SEO by Eric Enge, Stephan Spencer, Jessie Stricchiola, Rand Fishkin

AltaVista, barriers to entry, bounce rate, Build a better mousetrap, business intelligence, cloud computing, content marketing, dark matter, en.wikipedia.org, Firefox, folksonomy, Google Chrome, Google Earth, hypertext link, index card, information retrieval, Internet Archive, Larry Ellison, Law of Accelerating Returns, linked data, mass immigration, Metcalfe’s law, Network effects, optical character recognition, PageRank, performance metric, Quicken Loans, risk tolerance, search engine result page, self-driving car, sentiment analysis, social bookmarking, social web, sorting algorithm, speech recognition, Steven Levy, text mining, the long tail, vertical integration, Wayback Machine, web application, wikimedia commons

The basic tenets are: Don’t simply republish something that’s found elsewhere on the Web unless your site adds substantive value to users, and don’t infringe on others’ copyrights or trademarks. If you’re hosting affiliate content, expect to be judged more harshly than others, as affiliates in the SERPs are one of users’ top complaints about search engines. Small changes such as a few comments, a clever sorting algorithm or automated tags, filtering, a line or two of text, simple mashups, or advertising do not constitute “substantive value.” For some exemplary cases where websites fulfill these guidelines, check out the way sites such as CNET (http://reviews.cnet.com), Urbanspoon (http://www.urbanspoon.com), and Metacritic (http://www.metacritic.com) take content/products/reviews from elsewhere, both aggregating and “adding value” for their users.


pages: 936 words: 85,745

Programming Ruby 1.9: The Pragmatic Programmer's Guide by Dave Thomas, Chad Fowler, Andy Hunt

book scanning, David Heinemeier Hansson, Debian, domain-specific language, duck typing, Jacquard loom, Kickstarter, Neal Stephenson, off-by-one error, p-value, revision control, Ruby on Rails, slashdot, sorting algorithm, web application

If the block returns true (that is, a value other than nil or false), the method returns the corresponding element. If no 3. The basic Fibonacci series is a sequence of integers, starting with two 1s, in which each subsequent term is the sum of the two preceding terms. The series is sometimes used in sorting algorithms and in analyzing natural phenomena. 4. The find method is actually defined in module Enumerable, which is mixed into class Array. Report erratum B LOCKS AND I TERATORS 79 element matches, the method returns nil. The example shows the benefit of this approach to iterators. The Array class does what it does best, accessing array elements, and leaves the application code to concentrate on its particular requirement (in this case, finding an entry that meets some criteria).


Programming Python by Mark Lutz

Benevolent Dictator For Life (BDFL), Build a better mousetrap, business logic, business process, cloud computing, Firefox, general-purpose programming language, Google Chrome, Guido van Rossum, iterative process, linear programming, loose coupling, machine readable, MVC pattern, natural language processing, off grid, slashdot, sorting algorithm, web application

\PP4E\System\Streams> type adder3.py import sys sum = 0 for line in sys.stdin: sum += int(line) print(sum) Changing sorter to read line by line this way may not be a big performance boost, though, because the list sort method requires that the list already be complete. As we’ll see in Chapter 18, manually coded sort algorithms are generally prone to be much slower than the Python list sorting method. Interestingly, these two scripts can also be coded in a much more compact fashion in Python 2.4 and later by using the new sorted built-in function, generator expressions, and file iterators. The following work the same way as the originals, with noticeably less source-file real estate: C:\...

\PP4E\Dstruct\Classics> python sort2.py [{'name': 'john'}, {'name': 'doe'}] ({'name': 'doe'}, {'name': 'john'}) abcxyz This version also dispenses with the notion of a field altogether and lets the passed-in function handle indexing if needed. That makes this version much more general; for instance, it’s also useful for sorting strings. Data Structures Versus Built-ins: The Conclusion But now that I’ve shown you these reversing and sorting algorithms, I need to also tell you that they may not be an optimal approach, especially in these specific cases. Although these examples serve an educational role, built-in lists and functions generally accomplish what we just did the hard way: Built-in sorting tools Python’s two built-in sorting tools are so fast that you would be hard-pressed to beat them in most scenarios.


The Art of Computer Programming by Donald Ervin Knuth

Abraham Wald, Brownian motion, Charles Babbage, complexity theory, correlation coefficient, Donald Knuth, Eratosthenes, G4S, Georg Cantor, information retrieval, Isaac Newton, iterative process, John von Neumann, Louis Pasteur, mandelbrot fractal, Menlo Park, NP-complete, P = NP, Paul Erdős, probability theory / Blaise Pascal / Pierre de Fermat, RAND corporation, random walk, sorting algorithm, Turing machine, Y2K

But from the fact that F(x) is increasing and the fact that Fn(x) increases only in finite steps, we can derive a simple procedure for evaluating the statistics K+ and K~: Step 1. Obtain independent observations Xi, X2, ¦ ¦ ¦ ,Xn . Step 2. Rearrange the observations so that they are sorted into ascending order, X\ < X2 < ¦ ¦ ¦ < Xn. (Efficient sorting algorithms are the subject of Chapter 5. But it is possible to avoid sorting in this case, as shown in exercise 23.) Step 3. The desired statistics are now given by the formulas l<3<n\n An appropriate choice of the number of observations, n, is slightly easier to make for this test than it is for the x2 test, although some of the considerations are similar.