| Author |
Message |
06/01/2011 04:58:22
|
dlouwers
Joined: 06/01/2011 04:47:58
Messages: 4
Offline
|
Hi,
I have a couple of questions regarding the system and hope someone is willing and able to answer them:
- From the FAQ I gather that reads are allowed concurrently with writes. Is this correct? I am asking this since reading the result of a half-finished write operation can cause unexpected behavior. If this is the case is it also possible to use a lock more akin to Java's ReentrantReadWriteLock to have all writes operate in isolation with possibility of a downgrade to a read lock?
- Say a system would need 50GB of memory and has this readily available, wouldn't this much RAM totally smash performance of the garbage collector to the point of uselessness? Do you know of a solution to this bottleneck?
Thanks in advance,
Dirk Louwers
|
|
|
 |
06/01/2011 22:55:48
|
saoj
Joined: 05/09/2008 13:26:12
Messages: 46
Offline
|
dlouwers wrote:
- From the FAQ I gather that reads are allowed concurrently with writes. Is this correct? I am asking this since reading the result of a half-finished write operation can cause unexpected behavior. If this is the case is it also possible to use a lock more akin to Java's ReentrantReadWriteLock to have all writes operate in isolation with possibility of a downgrade to a read lock?
Yes, reads are concurrent with writes, like any modern database. Now you have to define unexpected behavior. If you have a User object and you want to freeze it, one easy option is to clone it. The thing with S4J is when you fetch an object from the Space, that's a live object.
- Say a system would need 50GB of memory and has this readily available, wouldn't this much RAM totally smash performance of the garbage collector to the point of uselessness? Do you know of a solution to this bottleneck?
I wouldn't think so. Are you saying that it is too much memory for the GC to track? I think modern GC strategies would not be affected by that.
If you want to be sure, allocate a lot of memory, call System.gc(), log the GC details and play with the various types of GC available in the JVM. Or do some research about the GC algorithms. I don't know this off the top of my head.
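A minimal sketch of the experiment suggested above, using only the standard `java.lang.management` API. The class name `GcProbe` and the default of 64 MB are illustrative assumptions; a realistic test would use a heap close to the intended 50GB and compare collectors with HotSpot flags such as `-verbose:gc`, `-XX:+UseParallelGC` or `-XX:+UseG1GC`.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class GcProbe {
    // Sums the collection time (ms) reported by every collector in this JVM.
    // A collector may report -1 when the value is unsupported; treat that as 0.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionTime());
        }
        return total;
    }

    // Keeps roughly `megabytes` MB of data live, requests a collection,
    // and returns the GC time (ms) accumulated while doing so.
    static long measure(int megabytes) {
        long before = totalGcMillis();
        List<byte[]> live = new ArrayList<>();
        for (int i = 0; i < megabytes; i++) {
            live.add(new byte[1024 * 1024]);
        }
        System.gc(); // only a request; the JVM is free to ignore it
        long spent = totalGcMillis() - before;
        if (live.size() != megabytes) {
            throw new IllegalStateException("unexpected list size");
        }
        return spent;
    }

    public static void main(String[] args) {
        int mb = args.length > 0 ? Integer.parseInt(args[0]) : 64;
        System.out.println("GC ms with " + mb + " MB live: " + measure(mb));
    }
}
```

The number that matters for a prevalent system is how pause time grows as the live data set grows, so it is worth running this at several sizes rather than once.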
This message was edited 1 time. Last update was at 06/01/2011 22:56:04
|
-Sergio Oliveira Junior |
|
|
 |
07/01/2011 03:07:43
|
dlouwers
Joined: 06/01/2011 04:47:58
Messages: 4
Offline
|
First of all, let me make it clear that I am a very firm believer in Prevalence as a very useful pattern in a lot of situations.
Yes, reads are concurrent with writes, like any modern database. Now you have to define unexpected behavior. If you have a User object and you want to freeze it, one easy option is to clone it.
Even though a database will allow concurrent reads with a single write in most default isolation levels, it will still make sure that writes to a field are atomic. A user will never read a half-written date field, for instance. Not so in code executed by the JVM. Take a long on a 32-bit JVM, for instance. From what I understand, writing to it is not atomic but in fact consists of 2 processor operations. It is possible for another thread to read from this long when only its lower or higher order bytes have been written. That is what I call, if not unexpected behavior, at least undesired behavior. To prevent stale caches on multiple cores I would mark these member variables as volatile anyway, and that would do away with this danger, but more complex mutable data would still need to have access synchronized. One way to deal with this using the tools at hand is treating all reads like a write, but that would eliminate the possibility of concurrent reads. And that is the reason why I am wondering about the possibility of ReentrantReadWriteLock-like behavior, because that addresses the issue very neatly without the need for more granular (and thus error-prone) locking.
The thing with S4J is when you fetch an object from the Space, that's a live object.
I would never suggest pulling these objects across the boundary! However, the issue I am mentioning here exists inside the space as well.
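To make the ReentrantReadWriteLock idea concrete, here is a small sketch of a plain (non-volatile) long guarded by such a lock, including the write-to-read downgrade mentioned earlier in the thread. `GuardedCounter` is a hypothetical class for illustration only; it is not part of Space4J.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

class GuardedCounter {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private long value; // plain long: only ever touched while holding the lock

    // Readers may run concurrently with each other, never alongside a writer.
    long read() {
        lock.readLock().lock();
        try {
            return value;
        } finally {
            lock.readLock().unlock();
        }
    }

    // Writes in isolation, then downgrades to a read lock before reading back.
    long addAndRead(long delta) {
        lock.writeLock().lock();
        try {
            value += delta;
            // Downgrade: take the read lock while still holding the write lock.
            lock.readLock().lock();
        } finally {
            lock.writeLock().unlock(); // other readers may proceed from here
        }
        try {
            return value;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Because every access goes through the lock, no reader can observe a half-written long, and readers still do not block one another.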
|
|
|
 |
07/01/2011 10:03:39
|
saoj
Joined: 05/09/2008 13:26:12
Messages: 46
Offline
|
dlouwers wrote:
Even though a database will allow concurrent reads with a single write in most default isolation levels, it will still make sure that writes to a field are atomic. A user will never read a half-written date field, for instance. Not so in code executed by the JVM. Take a long on a 32-bit JVM, for instance. From what I understand, writing to it is not atomic but in fact consists of 2 processor operations. It is possible for another thread to read from this long when only its lower or higher order bytes have been written. That is what I call, if not unexpected behavior, at least undesired behavior. To prevent stale caches on multiple cores I would mark these member variables as volatile anyway, and that would do away with this danger, but more complex mutable data would still need to have access synchronized. One way to deal with this using the tools at hand is treating all reads like a write, but that would eliminate the possibility of concurrent reads. And that is the reason why I am wondering about the possibility of ReentrantReadWriteLock-like behavior, because that addresses the issue very neatly without the need for more granular (and thus error-prone) locking.
Like you said, this is a classic JVM problem, not an S4J problem. The problem is: a mutable object shared by two or more read and write threads. Should the setters and getters be synchronized? Should the instance variables be made volatile? When I tackled this problem some years ago the answer was: single-core = NO; multi-core = YES. Plus: long and double = YES (64-bit), the rest = NO (32-bit or less). I would expect by now that primitive reads inside the JVM be guaranteed to be atomic, but I am not sure.
So this is a Java / Architecture problem. If you think that might be an issue, properly synchronize the write and read operations on your mutable object, so it can be thread-safe not just inside the Space but anywhere it is used in a multithreaded environment.
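As a sketch of the two options discussed here: the Java Language Specification (section 17.7) guarantees that reads and writes of a volatile long or double are atomic, while fields that must change together still need a lock. The class `Timestamps` and its fields are made up for illustration.

```java
class Timestamps {
    // volatile: reads and writes of this long are atomic and visible to other threads.
    private volatile long lastSeen;

    // Two fields that must stay consistent with each other: volatile alone
    // would not help, so all access goes through synchronized methods.
    private long start;
    private long end;

    void touch(long now) {
        lastSeen = now;
    }

    long lastSeen() {
        return lastSeen;
    }

    synchronized void setRange(long newStart, long newEnd) {
        if (newStart > newEnd) {
            throw new IllegalArgumentException("start must not exceed end");
        }
        start = newStart;
        end = newEnd;
    }

    // Synchronized so a reader never sees a new start paired with an old end.
    synchronized long length() {
        return end - start;
    }
}
```

Note that volatile only covers single reads and writes; a compound update such as `lastSeen++` would still need synchronization or an `AtomicLong`.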
I would never suggest pulling these object across the boundary! However the issue I am mentioning here exists inside the space as well.
When you fetch an object, you cannot make a clone/copy of the object. A relational database deals with values (columns) so it does not have this problem, but an object database deals with objects, so I don't see how you can escape from that. The objects will be live and you can do whatever you want with them, even modify them, which would be disastrous. Perhaps we could give a little more thought to that area for Space4J. My first guess is that you want to be able to iterate through your lists and maps inside the Space. To have to create a copy/clone of every iterated object is going to be bad.
This message was edited 4 times. Last update was at 07/01/2011 10:13:31
|
-Sergio Oliveira Junior |
|
|
 |
11/01/2011 04:43:22
|
dlouwers
Joined: 06/01/2011 04:47:58
Messages: 4
Offline
|
Like you said, this is a classic JVM problem, not an S4J problem.
I agree. However, S4J can be part of a solution to this problem.
When I tackled this problem some years ago the answer was: single-core = NO; multi-core = YES. Plus: long and double = YES (64-bit), the rest = NO (32-bit or less). I would expect by now that primitive reads inside the JVM be guaranteed to be atomic, but I am not sure.
Nowadays almost all platforms are multi-core (or semi-multi-core, like hyperthreading). The JVM spec leaves the implementation of 64-bit numbers to the implementer, so no assumptions can be made.
So this is a Java / Architecture problem. If you think that might be an issue, properly synchronize the write and read operations on your mutable object, so it can be thread-safe not just inside the Space but anywhere it is used in a multithreaded environment.
That is a possibility, but highly granular locking without some form of higher-order abstraction is extremely error-prone and almost sure to cause deadlocks or other issues that arise from data modification not being atomic. What are your thoughts on letting S4J be a possible solution to the problem and offer a stricter isolation mode by optionally offering access through a ReentrantReadWriteLock? Sure, it would be less performant than allowing concurrent reads/writes and leaving the nitty-gritty to the users. But it would be a nice option. This, together with the advice to never let mutable objects leave the prevalent system, could protect users from most if not all concurrency pitfalls.
And yes I know this takes time and effort. I am just mentioning this for the sake of discussion to see if you see the merit in it.
My first guess is that you want to be able to iterate through your lists and maps inside the Space. To have to create a copy/clone of every iterated object is going to be bad.
Yes, it would possibly copy the whole application data if you were at a root object. Instead I would probably copy relevant values into a data structure to function as a view. If reads were isolated from writes and reads could still operate concurrently this copy would be consistent.
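A sketch of the "view" approach, assuming reads are isolated from writes by a ReentrantReadWriteLock: only the relevant values are copied, under the read lock, into a structure the caller can keep. `Directory` and `namesView` are hypothetical names, not Space4J API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class Directory {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<String, String> phoneByName = new LinkedHashMap<>();

    void put(String name, String phone) {
        lock.writeLock().lock();
        try {
            phoneByName.put(name, phone);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Copies just the names into an unmodifiable list. The copy is taken
    // under the read lock, so it reflects one consistent state of the map,
    // and several threads can build their views at the same time.
    List<String> namesView() {
        lock.readLock().lock();
        try {
            return Collections.unmodifiableList(new ArrayList<>(phoneByName.keySet()));
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

The returned list is detached from the map, so later writes do not show up in a view that was already handed out.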
Best,
Dirk Louwers
|
|
|
 |
12/01/2011 00:47:44
|
saoj
Joined: 05/09/2008 13:26:12
Messages: 46
Offline
|
dlouwers wrote:
That is a possibility, but highly granular locking without some form of higher-order abstraction is extremely error-prone and almost sure to cause deadlocks or other issues that arise from data modification not being atomic. What are your thoughts on letting S4J be a possible solution to the problem and offer a stricter isolation mode by optionally offering access through a ReentrantReadWriteLock? Sure, it would be less performant than allowing concurrent reads/writes and leaving the nitty-gritty to the users. But it would be a nice option. This, together with the advice to never let mutable objects leave the prevalent system, could protect users from most if not all concurrency pitfalls.
And yes I know this takes time and effort. I am just mentioning this for the sake of discussion to see if you see the merit in it.
You convinced me about the merit of this.
Databases provide isolation levels, so S4J should provide them too. LET'S DO IT. Actually, do you want to start coding something? I can provide you with commit access to svn. Just let me know your account on sourceforge.
|
-Sergio Oliveira Junior |
|
|
 |
14/01/2011 03:50:10
|
dlouwers
Joined: 06/01/2011 04:47:58
Messages: 4
Offline
|
Actually, do you want to start coding something? I can provide you with commit access to svn. Just let me know your account on sourceforge.
Hi, my account name is dlouwers. I don't have time this month, but I will familiarize myself with the code at the start of February. The first thing that needs doing, in my opinion, is to refactor the transaction isolation out of the code if it hasn't already been separated.
Best,
Dirk Louwers
|
|
|
 |
05/03/2012 21:56:04
|
saoj
Joined: 05/09/2008 13:26:12
Messages: 46
Offline
|
Ok, after a great chat with Dirk (and don't forget to check his awesome startup) we agreed that Space4J should only encourage IMMUTABLE objects in the Space. I see that the PhoneBook examples already enforce that. However, they do not support phone editing. If they did, a command would have to create a new User object and replace the old one with the new one. In other words, not even a command should be allowed to change a mutable object in the space. That's because other threads may be reading this same object and bad things can happen, even if they are just reading instance variables (that's debatable, but better to be safe than sorry).
So all you have to do is issue a new PutCmd to replace the object in the map with the new one.
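The replace-instead-of-mutate idea can be sketched in plain Java as follows. This deliberately does not use the Space4J command API: `ImmutableUser`, `PhoneBookSketch` and the `ConcurrentHashMap` standing in for the Space are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Immutable value: all fields final, no setters.
final class ImmutableUser {
    private final String name;
    private final String phone;

    ImmutableUser(String name, String phone) {
        this.name = name;
        this.phone = phone;
    }

    String name() { return name; }
    String phone() { return phone; }

    // "Editing" produces a new instance and leaves this one untouched.
    ImmutableUser withPhone(String newPhone) {
        return new ImmutableUser(name, newPhone);
    }
}

class PhoneBookSketch {
    // Stand-in for the map that would live inside the Space.
    private final Map<String, ImmutableUser> users = new ConcurrentHashMap<>();

    void put(ImmutableUser user) {
        users.put(user.name(), user);
    }

    ImmutableUser get(String name) {
        return users.get(name);
    }

    // Swaps the stored object for a modified copy in one atomic map operation.
    void changePhone(String name, String newPhone) {
        users.computeIfPresent(name, (key, old) -> old.withPhone(newPhone));
    }
}
```

A thread that fetched the old `ImmutableUser` keeps seeing a consistent, unchanged object, while later lookups return the replacement.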
If you absolutely need to modify a MUTABLE object in the space, s4j will provide a demarcation feature backed by a ReentrantReadWriteLock. Of course, that lock should be used very carefully and with the correct granularity: you would never want to hold it for an excessive amount of time, as this will block the main WRITING thread (but not the other reading threads). That would be a new ISOLATION LEVEL for s4j. By default this locking mechanism will not be used; locking with the ReentrantReadWriteLock will be done only if this ISOLATION level is turned ON.
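One possible shape for such an opt-in isolation level is sketched below. Everything here is hypothetical (the class `IsolationGate`, the `strict` flag and the method names); it only illustrates "no locking by default, ReentrantReadWriteLock when turned on" and is not the actual s4j design.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

class IsolationGate {
    private final boolean strict; // false = default behavior, no locking
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    IsolationGate(boolean strict) {
        this.strict = strict;
    }

    // Runs a query; in strict mode it shares the read lock with other readers.
    <T> T read(Supplier<T> query) {
        if (!strict) {
            return query.get();
        }
        lock.readLock().lock();
        try {
            return query.get();
        } finally {
            lock.readLock().unlock();
        }
    }

    // Runs a command; in strict mode it holds the write lock exclusively,
    // so the command body should be kept short.
    void write(Runnable command) {
        if (!strict) {
            command.run();
            return;
        }
        lock.writeLock().lock();
        try {
            command.run();
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Diagnostic helper: true while the calling thread holds the write lock.
    boolean heldForWrite() {
        return lock.isWriteLockedByCurrentThread();
    }
}
```

With the flag off, the gate adds no synchronization at all, which matches the default described above.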
This message was edited 7 times. Last update was at 05/03/2012 22:02:18
|
-Sergio Oliveira Junior |
|
|
 |
|
|