Space4J - Java Persistence
Messages posted by: saoj
Ok. When you are using a compound index, Space4J needs to know how to compare the index's key, which is of course a compound key.

It does come with a default way of comparing compound keys: the KeyComparator. It compares Strings, Dates and Numbers like you would expect.

If you want to change the default comparator, because you have some complex objects in the compound key, you have to implement your own comparator and pass it to the Index when you create it.
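A minimal sketch of what such a comparator could look like, in plain Java and not tied to Space4J's actual classes. The compound key is modeled here as a List<Object>, and the name CompoundKeyComparator is hypothetical; how the comparator is handed to the Index depends on the Space4J version you use.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical comparator: compares two compound keys part by part.
// Assumption: a compound key is a List<Object> whose parts at the same
// position are mutually Comparable (String, Date, Number subclasses, ...).
class CompoundKeyComparator implements Comparator<List<Object>> {

    @Override
    @SuppressWarnings({"unchecked", "rawtypes"})
    public int compare(List<Object> a, List<Object> b) {
        int n = Math.min(a.size(), b.size());
        for (int i = 0; i < n; i++) {
            Comparable left = (Comparable) a.get(i);
            int c = left.compareTo(b.get(i));
            if (c != 0) {
                return c; // first differing part decides the order
            }
        }
        // Same prefix: the shorter key sorts first.
        return Integer.compare(a.size(), b.size());
    }

    public static void main(String[] args) {
        Comparator<List<Object>> cmp = new CompoundKeyComparator();
        List<Object> k1 = Arrays.<Object>asList(1, 3, 3000L);
        List<Object> k2 = Arrays.<Object>asList(1, 3, 4000L);
        System.out.println(cmp.compare(k1, k2) < 0);
    }
}
```

For complex objects in the key you would replace the cast to Comparable with whatever ordering makes sense for that type.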



It does not look like you need your own comparator. If you want to get a list that satisfies queueId = 1, shippingDone = 3 and timeNextSend = 3000, it is very easy:

1) Your index should be a MULTI index, because it is not UNIQUE.

2) It can be sorted or non-sorted, depending on whether you want to fetch RANGES or SORTED results.

3) After creating your index, all you have to do is:



If your index is sorted (MULTI_SORTED), you can fetch ranges:



Refer to the SortedMap documentation:

http://download.oracle.com/javase/1.4.2/docs/api/java/util/SortedMap.html
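As a rough illustration of the SortedMap range methods, here is a plain-Java sketch in which a TreeMap stands in for the map behind a sorted multi index. The class name, the Long key (think of timeNextSend) and the String records are all made up for the example; this is not Space4J's API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

class SortedRangeSketch {

    // Adds a record under a non-unique key: each key maps to a list of records.
    static void add(SortedMap<Long, List<String>> index, long key, String record) {
        List<String> bucket = index.get(key);
        if (bucket == null) {
            bucket = new ArrayList<>();
            index.put(key, bucket);
        }
        bucket.add(record);
    }

    // Collects every record whose key lies in [from, to), in key order.
    static List<String> range(SortedMap<Long, List<String>> index, long from, long to) {
        List<String> result = new ArrayList<>();
        for (List<String> bucket : index.subMap(from, to).values()) {
            result.addAll(bucket);
        }
        return result;
    }

    static SortedMap<Long, List<String>> sample() {
        SortedMap<Long, List<String>> index = new TreeMap<>();
        add(index, 1000L, "msg-a");
        add(index, 3000L, "msg-b");
        add(index, 3000L, "msg-c"); // second record under the same key
        add(index, 5000L, "msg-d");
        return index;
    }

    public static void main(String[] args) {
        // Keys 1000 and 3000 fall in the range; 5000 is excluded (upper bound is exclusive).
        System.out.println(range(sample(), 1000L, 5000L));
    }
}
```

headMap and tailMap work the same way when only one bound is needed.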

Plus I also recommend you to download and check the Space4J sources if you want to have a better understanding of its internals.

Let me know if it worked for you.





Not sure I followed it, can you rephrase?

Like a database, Space4J is not a cache, but it can be used like one. However, a cache needs an eviction policy. Java Collections support that, but Space4J does not.

The bug recently corrected was more of a missing feature. You can download the latest jar here: http://www.space4j.org/beta/space4j.jar or just use the stable version.
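For reference, this is one common way to get an eviction policy out of the standard Java Collections: a LinkedHashMap in access order with removeEldestEntry overridden. The LruCache name is just for this sketch, and nothing here is part of Space4J.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Least-recently-used cache built on java.util.LinkedHashMap.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // true = iterate in access order, not insertion order
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each put; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" becomes the most recently used entry
        cache.put("c", 3); // evicts "b", the least recently used entry
        System.out.println(cache.keySet());
    }
}
```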

The RemoveCmd is to be used in conjunction with PutCmd, in other words, to remove an object from a Collection.

What you need is a RemoveObjectCmd to remove an object from the space. This was missing, but I have added it.

Please download the beta jar: http://www.space4j.org/beta/space4j.jar



No. Just a bug. I fixed it. Thanks very much. Can you share your thoughts and opinions about the Space4J source code and API?

Riduidel wrote: Ah, Sergio, you're kidding me!
From the svn view I get, I can see that Space4J already uses generics and all Java 5 constructs!


You are the second person who has mentioned the lack of generics, so I thought they were missing somewhere in the project.

I know that the commands that create the collections (Map or List) use generics...



We actually prefer ant, but it would be ok to have both, maven and ant.

Feel free to work on the generic part.

-Sergio

Hummm. If the substring lookup is fixed then perhaps you can index it. But Space4J does not support this.

Something nice to be implemented by someone. Want to give it a try?

Or if the subset returned by price and discount is relatively small you can just iterate.
You can use two indexes (one for price and one for discount) or one composite index.

Check the code that can be found here: http://s4j.mentaframework.org/posts/list/5.page

Now for the type.contains("mx"), there is no index that can index this, not even in a relational database. So you have to loop and check for each record you return.
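A small plain-Java sketch of that "narrow with the index, then loop" approach. The Product class, its fields, and the price-only index are invented for the illustration, and a TreeMap stands in for a sorted index; the real query would also use discount.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical record type for the example.
class Product {
    final String type;
    final double price;

    Product(String type, double price) {
        this.type = type;
        this.price = price;
    }
}

class ContainsFilterSketch {

    static void add(NavigableMap<Double, List<Product>> byPrice, Product p) {
        List<Product> bucket = byPrice.get(p.price);
        if (bucket == null) {
            bucket = new ArrayList<>();
            byPrice.put(p.price, bucket);
        }
        bucket.add(p);
    }

    // Step 1: the sorted index narrows the candidates to price <= maxPrice.
    // Step 2: the substring test runs as a plain loop over that subset.
    static List<Product> find(NavigableMap<Double, List<Product>> byPrice,
                              double maxPrice, String fragment) {
        List<Product> matches = new ArrayList<>();
        for (List<Product> bucket : byPrice.headMap(maxPrice, true).values()) {
            for (Product p : bucket) {
                if (p.type.contains(fragment)) {
                    matches.add(p);
                }
            }
        }
        return matches;
    }

    static NavigableMap<Double, List<Product>> sample() {
        NavigableMap<Double, List<Product>> byPrice = new TreeMap<>();
        add(byPrice, new Product("cdmx", 10.0));
        add(byPrice, new Product("basic", 12.0));
        add(byPrice, new Product("mxpro", 50.0));
        return byPrice;
    }
}
```

With the sample data, find(sample(), 20.0, "mx") returns only the "cdmx" product: "basic" fails the substring test and "mxpro" is outside the price range.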



You don't need to care about this. You execute the commands on whatever node you are on, as if there were no cluster at all.

The node will know whether it is the master (so it executes locally) or a slave (so it sends to the master and waits for the answer).

From your point of view, you need to do nothing and treat the application as a normal non-cluster application.

There is even a stress test for the cluster in the examples.

Hi Jonathan,


I'm currently building a social website for a company that foresees a population of 500000 users in the system. The system will be hosted in the cloud by Amazon Web Service. According to my calculation the size of the "database" will grow to 8 GB. I'm really interested by Space4J to carry out the persistence. Do you know/have any experience feedback with this kind of system ?


This is an area where Space4J can be used to solve some problems. For example: if you have to traverse the friends graph to find the friends of friends of friends of friends, that can be heavy on a database, but if you have the graph of friends in memory then you are just following object references.

Problem: If you have too many users you may run out of RAM. You have to think about that. Plus you have to take into account the size of the indexes as well.
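To make the "just following object references" point concrete, here is a plain-Java sketch of a breadth-first walk over an in-memory friends graph. The Member class and the method names are invented for the example and are unrelated to Space4J's API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical user object holding direct references to its friends.
class Member {
    final String name;
    final Set<Member> friends = new HashSet<>();

    Member(String name) {
        this.name = name;
    }

    static void connect(Member a, Member b) {
        a.friends.add(b);
        b.friends.add(a);
    }
}

class FriendGraphSketch {

    // Returns everyone reachable from start in at most maxDepth friendship hops,
    // excluding start itself.
    static Set<Member> within(Member start, int maxDepth) {
        Set<Member> seen = new HashSet<>();
        seen.add(start);
        List<Member> frontier = Collections.singletonList(start);
        for (int depth = 0; depth < maxDepth && !frontier.isEmpty(); depth++) {
            List<Member> next = new ArrayList<>();
            for (Member m : frontier) {
                for (Member f : m.friends) {
                    if (seen.add(f)) { // add() returns false if already visited
                        next.add(f);
                    }
                }
            }
            frontier = next;
        }
        seen.remove(start);
        return seen;
    }

    // Builds a chain a - b - c - d - e and counts who is within `depth` hops of a.
    static int chainDemo(int depth) {
        Member a = new Member("a"), b = new Member("b"), c = new Member("c");
        Member d = new Member("d"), e = new Member("e");
        Member.connect(a, b);
        Member.connect(b, c);
        Member.connect(c, d);
        Member.connect(d, e);
        return within(a, depth).size();
    }
}
```

Each hop is a pointer dereference instead of a join, which is where the in-memory approach pays off.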


Can an index be built for a map that already has data? If yes, what is the expected behavior: is access to the index map blocking? Is there a way to be notified when the index is ready?


Yes. That happens exactly like it happens on a relational database when you index an existing table. Access is blocked until the indexation is finished. The method that performs the indexation is blocked until the indexation is finished, in other words, it is a synchronous operation. This is a blocking UPDATE on Space4J (see below).


When I get an object from Space4J, is it a copy? If not, how does it work when two threads want to update the same object? What is the good practice?


The object is not a copy; it is the object itself. The Space4J concept is only possible because UPDATES are serialized, in other words, they are atomic, isolated, happening one at a time. No two UPDATES are ever executed concurrently. READS, on the other hand, are executed concurrently, thanks to the new Java 1.6 concurrent collections. Space4J is like Oracle: updates only block updates. Readers don't block or get blocked by anything.


Is synchronous replication across the cluster supported?
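The general pattern can be sketched in plain Java as follows: one lock serializes all updates while reads go straight to a concurrent collection without locking. This is only an illustration of the idea with invented names, not Space4J's internal implementation.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class SerializedUpdatesSketch {
    private final ConcurrentMap<String, Integer> data = new ConcurrentHashMap<>();
    private final Object updateLock = new Object();

    // UPDATE: read-modify-write under a single lock, so updates run one at a time.
    void increment(String key) {
        synchronized (updateLock) {
            Integer current = data.get(key);
            data.put(key, current == null ? 1 : current + 1);
        }
    }

    // READ: no lock taken; never blocks and is never blocked by an update.
    Integer read(String key) {
        return data.get(key);
    }

    // Starts several threads that all update the same key and returns the final value.
    static int runDemo(int threads, final int perThread) {
        final SerializedUpdatesSketch space = new SerializedUpdatesSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    space.increment("hits");
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        Integer hits = space.read("hits");
        return hits == null ? 0 : hits;
    }
}
```

Because every increment holds the same lock, runDemo(4, 1000) ends at 4000 with no lost updates.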


Space4J has a cluster ring built in. To make a cluster you need to do nothing. The good thing about a cluster, besides fault tolerance and load balancing, is that it allows a snapshot of the data to be taken without putting the whole system in read-only mode. Only one cluster node has to enter read-only mode, and the whole system continues to work normally while the snapshot is being taken on that node. You can see a cluster example here: http://s4j.mentaframework.org/posts/list/6.page


You get an OutOfMemoryError and you have to increase your maximum heap size with the -Xmx option or, eventually, add physical RAM. Search for the Java heap size command-line options.
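A quick way to check which heap limit the JVM is actually running with is the standard Runtime API. The HeapInfo class name is just for this sketch; start it with, for example, java -Xmx2g HeapInfo to see the effect of the option.

```java
class HeapInfo {

    // Maximum heap the JVM will attempt to use, in megabytes.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```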

Amount of RAM should not be a problem, unless you have a really huge database. In this case you should probably use a regular relational database.

-Sergio

Are all serializable objects held in memory while working with Space4J?


Yes, all objects are in memory, but they are not serialized in memory. They get serialized when they are logged to disk in a command or when a snapshot is taken.
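The distinction can be illustrated with standard Java serialization: the live map holds ordinary objects, and bytes are only produced at the moment a snapshot is written. This sketch uses an in-memory byte array in place of a file and invented names; it does not show Space4J's own log or snapshot format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;

class SnapshotSketch {

    // Serializes the live map once, at snapshot time.
    static byte[] snapshot(HashMap<String, Integer> live) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(live);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Rebuilds an independent copy of the map from the snapshot bytes.
    @SuppressWarnings("unchecked")
    static HashMap<String, Integer> restore(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (HashMap<String, Integer>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Changes made to the live map after snapshot() returns do not appear in the restored copy, since the bytes were fixed when the snapshot was taken.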


I mean, let's suppose that I've added 1K of objects into space4j...My question is, are all of these objects in memory? Or does it mean that half of these objects (the most used) is in memory and the second half is saved on HD???


All objects are always in memory, even if they are stale objects that are not being accessed.

I thought about a *passivation* strategy for Space4J in the past, that would swap unused/old objects from memory to disk. But that complicates the whole thing and loses the focus of Space4J which is in-memory access straight through collections.

If you have a logging table that grows indefinitely and the information is not accessed regularly, you will be better off with a text log or a relational database. To keep this information in memory would be just a waste of RAM.


1) All objects are serialized in the "./space4_db" directory. Is it somehow possible to set the directory and file where the objects will be serialized?


Download the beta jar from http://www.space4j.org/beta/space4j.jar and call SimpleLogger.setDir("c:\\mydb"), for example. You can also use a relative path like SimpleLogger.setDir("mydb").


2) I want to use REGULAR index (unique-index, non-sorted), but I didn't find an example of how to get my stored value....Of course I checked an example on this forum

User u = usersById.get(2345);

but I don't understand one thing, what type of the object is usersById???? Can you provide full example where REGULAR index is used.


usersById in this example is the map which is returned by the Index object.



Once you have created the index object by calling im.createIndex(indx, space4j); you can get its map to perform the lookup:



Note that you need a key object. You can even have composite indexes with composite keys (more than one attribute).
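Conceptually, the map behind a unique, non-sorted index is just a hash map from the indexed attribute to the object. The following plain-Java sketch shows that idea with invented User and UniqueIndexSketch classes; it does not use Space4J's Index classes, whose exact signatures you should check in the sources.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stored object.
class User {
    final int id;
    final String email;

    User(int id, String email) {
        this.id = id;
        this.email = email;
    }
}

class UniqueIndexSketch {

    // Builds a unique, non-sorted index: one id maps to exactly one User.
    static Map<Integer, User> indexById(Iterable<User> users) {
        Map<Integer, User> usersById = new HashMap<>();
        for (User u : users) {
            if (usersById.put(u.id, u) != null) {
                throw new IllegalStateException("duplicate id " + u.id);
            }
        }
        return usersById;
    }

    static List<User> sample() {
        List<User> users = new ArrayList<>();
        users.add(new User(2345, "ann@example.com"));
        users.add(new User(2346, "bob@example.com"));
        return users;
    }

    public static void main(String[] args) {
        Map<Integer, User> usersById = indexById(sample());
        User u = usersById.get(2345); // the key is the indexed attribute's value
        System.out.println(u.email);
    }
}
```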

Also note that it is recommended that you store your objects in the space inside Maps, not Lists. The reason for that is the same reason for always having a Primary Key in a database table.

I have been thinking about this and my bet is that the bottleneck is on disk I/O.

Disk I/O after a write (or a commit in a relational DB) is pretty much unavoidable, because you need to make sure data is persisted and safe in case of a crash.

I wonder which is faster: a relational database doing multiple inserts and commits in sequence, or Space4J.

If you are really concerned about this speed, you can perform the disk I/O asynchronously, but the price is clear: you will not be sure when your write is persisted and safe on disk. After a crash, you will end up losing some past successful writes.
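The trade-off can be sketched with the standard NIO API: a synchronous append returns only after FileChannel.force has pushed the data to the storage device, while an asynchronous append returns immediately and the caller learns about durability only if it waits on the Future. The class and method names are invented, and this is not how Space4J's logger is implemented.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class DurableLogSketch {

    // Synchronous: when this returns, the record has been forced to disk.
    static void appendSync(Path log, byte[] record) throws IOException {
        try (FileChannel ch = FileChannel.open(log, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            ByteBuffer buf = ByteBuffer.wrap(record);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            ch.force(true); // flush file content and metadata to the device
        }
    }

    // Asynchronous: returns at once; a crash before the task runs loses the record.
    static Future<Void> appendAsync(ExecutorService writer, final Path log, final byte[] record) {
        Callable<Void> task = () -> {
            appendSync(log, record);
            return null;
        };
        return writer.submit(task);
    }

    // Writes one record each way to a temporary file and returns the file size in bytes.
    static long demo() {
        try {
            Path log = Files.createTempFile("space-sketch", ".log");
            ExecutorService writer = Executors.newSingleThreadExecutor();
            try {
                appendSync(log, "first\n".getBytes(StandardCharsets.UTF_8));
                Future<Void> pending = appendAsync(writer, log, "second\n".getBytes(StandardCharsets.UTF_8));
                pending.get(); // only now is the second record known to be on disk
                return Files.size(log);
            } finally {
                writer.shutdown();
                Files.deleteIfExists(log);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A single writer thread keeps the asynchronous records in submission order; the window between submit and completion is exactly the set of writes a crash could lose.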

So two questions remain:

1) Is Space4J as fast as any regular relational database when it comes to multiple and consecutive inserts/commits?

2) Is there a scenario where doing async disk I/O is actually desirable considering its cost?

Let me know your thoughts...


 