Cassandra Database overwhelmed?


I created a table in a cassandra database with the following query:

CREATE TABLE table( num int, part_key int, val1 int, val2 float, val3 text, ..., PRIMARY KEY((part_key),num) );

The table stores data from a technical device. The partitioning key part_key is 1 for the every record, because I want to execute range queries on only one server. I know this is a bad use case for Cassandra, but I need to do this for a comparison.

The primary key num is the number of the record (from 1 to 8.000.000).

There are like 400 other values per record that are float, int and text type. I Inserted 8.000.000 records to this table (43 GB) and wanted to run my queries like:

SELECT num, val1, val45, val90 FROM ks.table WHERE part_key=1 AND num>9999 AND num<20001;

I executed the query in the cql-shell and got "operation timed out". So I changed read_request_timeout_in_ms and range_request_timeout_in_ms in the cassandra.yaml file to 60000 (2 minutes).

When executing the query again I got "Error 10054: the existing connection was closed by the remotehost" after 5 minutes. The Datastax Cassandra Community Server 2.0.11 Service was not running anymore on the server.

I restarted the service, tried it again and the service crashed again. I could not even restart the service and I had to restart the server. I also tried this using the Cassandra cpp-driver and also could not execute this query.

Small queries like

... AND num<1000;

are still possible.

My question is: did I do something wrong? I know Cassandra is better with more nodes but I thought that Cassandra would only need some more time. Is it possible, that Cassandra is unable to execute a query like that?

Thank You!

The server:

Intel(R) Xeon(R) CPU E5504 @ 2.00GHz 2.00GHz (2 processors) / 16GB RAM

CPU utilization: 50% - 60% and after 15 seconds around 30% / RAM: 2,9 GB the whole time


My Cassandra keyspace is now 60GB and small queries like

... AND num<10;

and even the Inserts return time out. Sometimes the service crashes... Please can someone who got an idea explain that? One answer said that a node with 43GB is not the same in a cluster with more nodes as in my cluster with only one node. Can somebody explain this?

Thank You!


One of the key issues here is that cqlsh with the version of C* that you are running does not page through results. This means the entire result-set has to be serialized at the time of the query which given your data model will be quite large(as pointed out by kha). I would try performing similar queries using a paging enabled driver and of course make sure that you have sufficient network bandwith for returning the data.

43GB Should be easily handled by a single C* node, although operating a C* cluster with only a single node sacrifices almost all of the benefits that C* offers.


  • Android store byte[] in the keystore
  • Cassandra CQL 3 - Prefix Select
  • How to check connection of cassandra with pentaho data integrator
  • Using Python and BeautifulSoup to Parse a Table
  • Cassandra Batch Insert in Python
  • Spring-Kafka Integration 1.0.0.RELEASE Issue with Producer
  • java keystore and password changing
  • Matlab and mechanics (mostly physics)
  • How do I add conditionally to a selection of cells in a pandas dataframe column when the the column
  • How to return a HashTable from a WebService?
  • How can I substitute my own custom dynamic scaffolding methods
  • ASP Net Core - Mixing External Identity Provider with Individual User Accounts for Audit Tracking
  • How do I get brown text in a batch file?
  • Getting Application-defined or object-defined error
  • spring data neo4j 3.0.0 - why two labels set by default
  • Grouping by blank nodes
  • Does sql server minds the way records where inserted?
  • Linux command line : edit hacked index files
  • Thread synchronization with syncwarp
  • How can I run DataNucleus Bytecode Enhancer from SBT?
  • Jquery Knockout: ko.computed() vs classic function?
  • Updating both a ConcurrentHashMap and an AtomicInteger safely
  • How to avoid particles glitching together in an elastic particle collision simulator?
  • ActionScript 2 vs ActionScript 3 performance
  • ORA-29908: missing primary invocation for ancillary operator
  • How to get next/previous record number?
  • VB.net deserialize, JSON Conversion from type 'Dictionary(Of String,Object)' to type '
  • How can I get HTML syntax highlighting in my editor for CakePHP?
  • How do you join a server to an Active Directory (domain)?
  • How do I configure my settings file to work with unit tests?
  • How does Linux kernel interrupt the application?
  • How to get Windows thread pool to call class member function?
  • IndexOutOfRangeException on multidimensional array despite using GetLength check
  • costura.fody for a dll that references another dll
  • Binding checkboxes to object values in AngularJs
  • Observable and ngFor in Angular 2
  • How to Embed XSL into XML
  • UserPrincipal.Current returns apppool on IIS
  • Conditional In-Line CSS for IE and Others?
  • java string with new operator and a literal