5 results
Forge Your Future with Open Source by VM (Vicky) Brasseur
AGPL, anti-pattern, Benevolent Dictator For Life (BDFL), call centre, continuous integration, Contributor License Agreement, Debian, DevOps, don't repeat yourself, en.wikipedia.org, Firefox, FOSDEM, Free Software Foundation, Guido van Rossum, information security, Internet Archive, Larry Wall, microservices, Perl 6, premature optimization, pull request, Richard Stallman, risk tolerance, Turing machine
Focused purely on design and development of the next version of OpenStack, the PTG attracts many hundreds of community members and contributors. The OpenStack Summit, held twice a year, regularly hosts many thousands of attendees and is one of the largest FOSS-related events in the world. The largest event of that kind, however, is FOSDEM. Unlike the other events, FOSDEM does not focus on a single community or technology. Instead, it welcomes everyone interested in or participating in free and open source software. Each year, more than 6,000 attendees from all over the globe travel to Brussels, Belgium at the start of February to learn from and meet with their fellow FOSS enthusiasts.
Vagrant: Up and Running by Mitchell Hashimoto
Amazon Web Services, barriers to entry, Debian, DevOps, FOSDEM, remote working, software as a service, web application
[Index, entries V–W] About the Author: Mitchell Hashimoto is a passionate engineer, professional speaker, and entrepreneur. Mitchell has been creating and contributing to open source software for almost a decade. He has spoken at dozens of conferences about his work, such as VelocityConf, OSCON, FOSDEM, and more. Mitchell is the founder of HashiCorp, a company whose goal is to make the best DevOps tools in the world, including Vagrant. Prior to HashiCorp, Mitchell spent five years as a web developer and another four as an operations engineer. Colophon: The animal on the cover of Vagrant: Up and Running is a blue rock pigeon (Columba livia).
MongoDB: The Definitive Guide by Kristina Chodorow, Michael Dirolf
create, read, update, delete, Debian, FOSDEM, pattern recognition, Ruby on Rails, web application
[Index, pages 186–193, entries F–W] About the Authors: Kristina Chodorow is a core contributor to the MongoDB project. She has worked on the database server, PHP driver, Perl driver, and many other MongoDB-related projects. She has given talks at conferences around the world, including OSCON, LinuxCon, FOSDEM, and Latinoware, and maintains a website about MongoDB and other topics at http://www.snailinaturtleneck.com. She works as a software engineer for 10gen and lives in New York City. Michael Dirolf, also a software engineer at 10gen, is the lead maintainer for PyMongo (the MongoDB Python driver), and the former maintainer for the MongoDB Ruby driver.
HBase: The Definitive Guide by Lars George
Alignment Problem, Amazon Web Services, bioinformatics, create, read, update, delete, Debian, distributed revision control, domain-specific language, en.wikipedia.org, fail fast, fault tolerance, Firefox, FOSDEM, functional programming, Google Earth, information security, Kickstarter, place-making, revision control, smart grid, sparse data, web application
[Index, entries Y–Z] About the Author: Lars George has been involved with HBase since 2007, and became a full HBase committer in 2009. He has spoken at various Hadoop User Group meetings, as well as large conferences such as FOSDEM in Brussels. He also started the Munich OpenHUG meetings.
He now works closely with Cloudera to support Hadoop and HBase in and around Europe through technical support, consulting work, and training. Colophon The animal on the cover of HBase: The Definitive Guide is a Clydesdale horse. Named for the district in Scotland where it originates, the breed dates back to the early nineteenth century, when local mares were crossed with imported Flemish stallions.
Architecting Modern Data Platforms: A Guide to Enterprise Hadoop at Scale by Jan Kunigk, Ian Buss, Paul Wilkinson, Lars George
Amazon Web Services, barriers to entry, bitcoin, business intelligence, business logic, business process, cloud computing, commoditize, computer vision, continuous integration, create, read, update, delete, data science, database schema, Debian, deep learning, DevOps, domain-specific language, fault tolerance, Firefox, FOSDEM, functional programming, Google Chrome, Induced demand, information security, Infrastructure as a Service, Internet of things, job automation, Kickstarter, Kubernetes, level 1 cache, loose coupling, microservices, natural language processing, Network effects, platform as a service, single source of truth, source of truth, statistical model, vertical integration, web application
After a torrent of professional services work across financial services, cybersecurity, adtech, gaming, and government, he’s seen it all, warts and all. Or at least, he hopes he has. Lars George has been involved with Hadoop and HBase since 2007, and became a full HBase committer in 2009. He has spoken at many Hadoop User Group meetings, and at conferences such as Hadoop World and Hadoop Summit, ApacheCon, FOSDEM, and QCon. He also started the Munich OpenHUG meetings. Lars worked for Cloudera for over five years as the EMEA chief architect, acting as a liaison between the Cloudera professional services team and customers and working with partners in and around Europe, building the next data-driven solutions.