Implement an IsolateInvoker for the shell

Project:JNode Shell
Component:Code
Category:task
Priority:critical
Assigned:Unassigned
Status:closed
Description

The 'proclet' mechanism has some problems in some unusual use-cases; e.g. issue 2768. The solution is to bite the bullet and implement an invoker that will launch commands in their own isolates.

#1

I've checked in a partial implementation of a new isolate command invoker. So far I've been able to get it to execute a 'classic' Java command with no arguments; i.e. one which has a 'main' entry point and does not use the syntax mechanism. The immediate problem is that I haven't figured out how to notify the invoker that the isolate running the command has finished.

Levente: It looks like isolate status messages are not yet implemented. Would you be able to do that, or should I have a go? Also, could you comment on the way that I'm passing command state to the child isolate; is this the right way to do it? If you are feeling brave, the way to try out the new invoker is to run the following from the JNode console:

    set jnode.debug true
    set jnode.invoker isolate
    <some.classic.command>

Warning: you won't get back the command prompt on the console.

#2

Several aspects of isolates are not fully developed yet. I'm making some progress with it but it takes time.

I'm not sure yet if this invoker-based solution is the way to go, because now someone will be able to enable the isolate invoker and, according to my understanding, from that moment on everything will be invoked in a new isolate. If we stop for a moment and think, this was never the goal. Instead, the desired behaviour of the shell would be to run every kind of executable entity in the way most suitable for it, based on its runtime properties, such as the implementation of a given interface or the presence of an annotation or configuration property.

Isolates would be most helpful in supporting the normal behaviour of a JVM running on current mainstream operating systems. The primary use case is that the user wants to run an arbitrary program without much knowledge about its internals. It's obvious that such programs are best run in a new isolate to avoid any undesired interference with the core components of the system or with other similar user programs.

On the other hand, for the majority of commands and components created within the JNode project, isolates will not be needed, and by using the other invocation methods we can achieve better startup performance and lower memory overhead.

In my opinion we should pursue this goal in the default shell: without any special setup in the shell, but with possible configuration of the executables, the shell should be able to select and carry out the optimal method of execution for an executable upon startup.

#3

I think that the java command is the better candidate for being run in a new isolate since, in other operating systems, each application is run in a new JVM.

In the future, as our implementation of isolates evolves, we might even support the -Xmx and -Xms options, system properties and other JVM settings...

#4

It would be a bad idea to require the user to run commands in isolates explicitly. The problem is that functionality like 'kill' (^C) and 'stop' (^Z) can only be implemented for isolated commands. Most times, users won't predict that they are going to need to kill/stop a command, won't launch the command using the 'java' command, and therefore won't be able to ^C/^Z when they need to. In short, the 'java' command is a distinctly inferior solution from a usability perspective.

But this is beside the point. The missing isolate stuff that I will need for an isolate command invoker is also needed for a 'java' command. Immediately, I need the javax.isolate API methods which will allow me to tell if an isolate has "finished". Soon after I will need the methods that allow a parent isolate to stop and kill a child isolate. It looks like I'll need to implement these myself.
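As a rough illustration of the "has it finished" notification, here is a minimal sketch in plain Java. It does not use the javax.isolate API: a thread stands in for the child isolate and a queue stands in for a status link, and the names ExitWaiter and runAndWait are hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: a thread stands in for the child isolate,
// and a queue stands in for its status link.
final class ExitWaiter {
    /** Runs the command and blocks until its "EXITED" status arrives or the timeout elapses. */
    static boolean runAndWait(Runnable command, long timeoutMillis) {
        BlockingQueue<String> status = new LinkedBlockingQueue<>();
        Thread child = new Thread(() -> {
            try {
                command.run();
            } finally {
                status.add("EXITED"); // always report termination, even on failure
            }
        });
        child.start();
        try {
            // The invoker blocks here instead of polling the child.
            return "EXITED".equals(status.poll(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The point of the sketch is only that the invoker waits on a message rather than inspecting the child directly; a real implementation would presumably receive the status over an isolate link.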

#5

Levente wrote this:

I'm not sure yet if this invoker based solution is the way to go because now someone will be able to enable the isolate invoker and according to my understanding starting from that moment everything will be invoked in a new isolate.

I think that this is not how things would work in the long term. My ideal behavior would be to transparently use the "right" invocation mechanism according to what is most likely to be best for the user. Here are some ideas:

  • A complete shell command line (entered interactively) should generally run in an isolate so that the user could ^C / ^Z it. The individual commands within a command line (e.g. in a pipeline) should all run in the same isolate by default.
  • A shell script or function would also run in a single isolate by default.
  • Individual commands could be marked so that they are launched with a specific invoker by default. For example, "set", "env" and a few other commands should be marked as not requiring a new isolate.
  • A special bit of shell syntax could be provided so that the user can override the default invocation behavior. For example, we might use '!' to select the opposite of the default behavior so that:
        JNode> ! set jnode.debug true
        JNode> ! javac some.java

    would tell the shell to run the "set" command in another isolate, and "javac" in the shell's current isolate. (Yeah ... the syntax/semantics are naff ... but it's a starting point.)
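The default-plus-override idea in the list above could be sketched roughly as follows. This is only an illustration: InvocationChooser, runInIsolate and the IN_SHELL set are hypothetical names, not part of the JNode shell, and the parsing is deliberately naive.

```java
import java.util.Set;

// Hypothetical sketch: none of these names come from the JNode code base.
final class InvocationChooser {
    // Commands assumed to be marked as not requiring a new isolate.
    private static final Set<String> IN_SHELL = Set.of("set", "env");

    /** Returns true if the command line should run in a new isolate. */
    static boolean runInIsolate(String commandLine) {
        String line = commandLine.trim();
        boolean invert = line.startsWith("!");
        if (invert) {
            line = line.substring(1).trim();
        }
        // The first word of the line is taken to be the command name.
        String name = line.isEmpty() ? "" : line.split("\\s+")[0];
        boolean defaultIsolate = !IN_SHELL.contains(name);
        // A leading '!' flips whatever the default would have been.
        return invert != defaultIsolate;
    }
}
```

With this sketch, "set jnode.debug true" stays in the shell's isolate while "! set jnode.debug true" gets a new one, and the reverse holds for "javac".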

I see the current shell implementation with its distinct 'invokers' and its cumbersome method of switching between them as being an interim solution. Let's not get too hung up on its obvious inadequacies.

#6

"To require the user to run commands in isolates explicitly" - it's definitely a bad idea and I will not support such a proposal.

Here is my proposal again in a distilled form (in the hope that this time it will not be misunderstood):
- the java command should always start a new isolate
- the shell should figure out from a command the optimal execution method for that command and execute it accordingly. This might include thread-based execution, proclet-based execution and isolate-based execution, among others. The shell could make the decision based on various factors such as: the provenance of the code to execute, the presence of an implemented interface or superclass, the presence of an annotation, the presence of a configuration option in the command syntax specification, etc. If no specific information is found for the execution of the command, then the shell could consider it an unknown executable and, for safety reasons, start it in a new isolate.

For this the current invoker-based solution would need some modification. The idea is that the invoker will no longer be predefined as a shell property but dynamically selected for each command according to the type of the executable referred to by the command. In the future this model could also include special invocation methods for scripts written in various scripting languages.
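A minimal sketch of per-command invoker selection, assuming the hint is carried by an annotation on the command class. The names InvokerKind, RunWith and InvokerSelector are hypothetical and exist only for this illustration; the other factors mentioned above (provenance, interfaces, syntax specification options) are left out.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical sketch: the enum, annotation and selector are illustrative names only.
enum InvokerKind { THREAD, PROCLET, ISOLATE }

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface RunWith {
    InvokerKind value();
}

final class InvokerSelector {
    /** Picks an invoker for a command class; unknown executables get a new isolate. */
    static InvokerKind select(Class<?> commandClass) {
        RunWith hint = commandClass.getAnnotation(RunWith.class);
        if (hint != null) {
            return hint.value();
        }
        // No specific information found: treat as unknown and isolate it for safety.
        return InvokerKind.ISOLATE;
    }
}

// Example command classes, also hypothetical.
@RunWith(InvokerKind.THREAD)
final class SetCommandExample { }

final class UnknownProgramExample { }
```

Here the annotated class is run by the thread invoker, while the class with no hint falls back to an isolate.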

Update: this post is in reply to post: #4

#7

You are free to try to implement anything including the two missing features of the isolation api that you mentioned.
I would just like to let you know that I'm also working on it, making various experiments. The forced termination and suspension of the execution of an isolate are difficult problems. Terminating an isolate basically means terminating its threads of execution. But at the moment we don't have a safe way of forcibly terminating a thread. To do so we will need to make sure that after the thread has terminated all its internal data structures become eligible for GC and that after GC the system is left in a consistent state. It also requires that the thread can be terminated under any condition, even if it's waiting on an IO operation like disk or socket access, and that the thread will release all locks that it holds. These are basically open topics right now which need deep investigation first of all.

Regarding the normal finishing of an isolate, I have made some steps towards the solution but this is also quite complex if considered completely and it needs the support for terminating threads.
So far I have implemented support for the case when an isolate is single-threaded. I made sure that when the only thread of the isolate finishes, the isolate gets cleaned up properly, which means that its internal data structures become eligible for garbage collection. You can verify this by means of the newly created IsolateCommand for listing isolates and the IsolateTest class for running code in a new isolate.
Further work is needed to support the case when the isolate is multithreaded with no daemon threads.
And yet more work is needed when the isolate has daemon threads. This case is related to thread termination because the rule is that when all non-daemon threads have finished, the daemon threads should be forcibly terminated and the isolate should exit.
Also to be done is the firing of the appropriate events on isolate startup and termination, which is the easier part of the problem.
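The exit rule described above (the isolate exits once its last non-daemon thread finishes) can be modelled with a simple counter. This is only a sketch of the bookkeeping under assumed names (IsolateExitTracker and its methods are hypothetical); it does not attempt the hard part, which is actually terminating the remaining daemon threads.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: models only the bookkeeping, not real thread termination.
final class IsolateExitTracker {
    private final AtomicInteger nonDaemonThreads = new AtomicInteger();
    private volatile boolean exited;

    /** Records a thread starting inside the isolate. */
    void threadStarted(boolean daemon) {
        if (!daemon) {
            nonDaemonThreads.incrementAndGet();
        }
    }

    /** Records a thread finishing; the last non-daemon thread triggers the exit. */
    void threadFinished(boolean daemon) {
        if (!daemon && nonDaemonThreads.decrementAndGet() == 0) {
            // At this point any remaining daemon threads would have to be
            // forcibly terminated and the exit event fired.
            exited = true;
        }
    }

    boolean hasExited() {
        return exited;
    }
}
```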

If you look into these problems it would be nice to discuss the findings and various solutions before the implementation.

#8

Terminating an isolate basically means terminating its threads of execution. But at the moment we don't have a safe way for forcibly terminating a thread. For doing so we will need to make sure that after the thread terminated all its internal data structures become eligible for GC and after gc the system is left in a consistent state. It also requires that the thread can be terminated under any condition even if it's waiting on an IO operation like disk or socket access and that the thread will release all locks that have been held by it. These are basically open topics right now which need deep investigation first of all.

One way to avoid problems is to create proxy versions of the key JNode services that call the real services using the isolate Link mechanism. When the isolate system needs to forcibly kill an Isolate, it kills the blocked isolate threads and closes the links. Threads in the other isolate that are blocked in (orphaned) IO requests will eventually wake up. When they do, the service stub will attempt to return the results/exceptions via the Link, discover that the Link is closed and quietly drop them.
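A rough sketch of the "quietly drop the reply" behaviour, assuming a simplified stand-in for a link. ReplyLink and ServiceStub are hypothetical names and do not reflect the javax.isolate.Link API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an isolate Link; not the javax.isolate.Link API.
final class ReplyLink {
    private final List<String> delivered = new ArrayList<>();
    private boolean closed;

    /** Called by the isolate system when the owning isolate is killed. */
    synchronized void close() {
        closed = true;
    }

    /** Returns false instead of throwing when the link is already closed. */
    synchronized boolean trySend(String reply) {
        if (closed) {
            return false;
        }
        delivered.add(reply);
        return true;
    }

    /** Returns a snapshot of the replies that reached the caller. */
    synchronized List<String> delivered() {
        return new ArrayList<>(delivered);
    }
}

final class ServiceStub {
    /** Completes a request; returns true only if the reply reached the caller. */
    static boolean complete(ReplyLink link, String result) {
        // A closed link means the calling isolate is gone, so the result is dropped.
        return link.trySend(result);
    }
}
```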

A related problem is making sure that one isolate cannot lock an object that belongs to another isolate. I think this boils down to avoiding the sharing of objects as far as possible, and being very careful when object sharing is necessary.

Finally there is the problem of making java.lang.Thread.kill "work" albeit unsafely. Last time I tried it, I ran into a kernel panic. We need to figure that out and fix it.

I suspect that the GC won't prove to be that difficult ... but that's just my gut feeling.

BTW, this is all "phase 2" work. Phase 1 is simply getting isolate commands to exit normally. With that in place we at least have an alternative to proclets that doesn't suffer from problems like the DaCapo one that started all of this.

#9

It would make the implementation cleaner if the classes in org.jnode.vm.isolate.links were moved to org.jnode.vm.isolate. Currently, a number of methods which should be 'package private' need to be declared as 'public'.

I think it is necessary to add a public constructor to javax.isolate.IsolateStatus. Otherwise we are going to need to use reflection to construct IsolateStatus objects.

Any comments Levente?

#10

I agree with moving the link stuff from org.jnode.vm.isolate.links to org.jnode.vm.isolate.

I'm not sure yet if ObjectLinkMessage is a good idea. On the one hand, there is no trace of it in the spec and it would encourage passing arbitrary objects from one isolate to another, which should be avoided. On the other hand, the only place where you are using it will probably need a review. The current isolate invoker obfuscates the system because it hides the real class which gets executed in the output of the isolate command. I wonder if, instead of passing the command runner to the new isolate as a message, it would be better to start up the isolate in a new isolate-specific command runner.
That could probably eliminate the need for ObjectLinkMessage and also make it clear what is executing.

Why do you think VmIsolate.start() should be a synchronized method?

#11

I'm not sure yet if ObjectLinkMessage is a good idea.

Me neither. In the current use-case (passing the CommandRunner), I probably should serialize the state and reconstruct it in the new isolate. I have a feeling that passing the reference to the Command object and the bound Argument objects in the CommandRunner could result in problems. But as you can imagine, there is a lot to do to get the required serialization and reconstruction working properly!

Why do you think VmIsolate.start() should be a synchronized method?

Because of interactions involving the status link and messages. For instance, thread #1 creates an isolate, and passes the handle to thread #2. Then, while thread #1 calls "start", thread #2 calls "newStatusLink". IMO, it is better to be cautious with synchronization because it is so difficult to reproduce / track down synchronization bugs. (And it's only going to get harder when Peter gets multi-processor support working! In fact, I suspect that there are lots of synchronization bugs lurking in the codebase just waiting to cause problems when MP is enabled ...)
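To make the start / newStatusLink interaction concrete, here is a simplified model in which both methods synchronize on the same handle. It is not the VmIsolate code: IsolateHandleSketch, its string states and the list-based "links" are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical model of an isolate handle; not the real VmIsolate class.
final class IsolateHandleSketch {
    private final List<List<String>> statusLinks = new ArrayList<>();
    private String state = "CREATED";

    /** Changes the state and notifies every status link registered so far. */
    synchronized void start() {
        state = "STARTED";
        for (List<String> link : statusLinks) {
            link.add(state);
        }
    }

    /** Registers a new status link, which first reports the current state. */
    synchronized List<String> newStatusLink() {
        List<String> link = new CopyOnWriteArrayList<>();
        link.add(state);
        statusLinks.add(link);
        return link;
    }
}
```

Because both methods hold the same lock, a link created while another thread is in start() sees either "CREATED" followed by "STARTED" or just "STARTED", and never misses the transition. Whether the whole of start() needs the lock to get this property is exactly the point under discussion.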

#12

Serializing the CommandRunner will just make command startup a heavy operation.
Please favor simplicity and small code.
There should be a better solution for this problem which doesn't hide the isolate's main class either.

If you look carefully you will see that the part of the start method which changes the isolate state was already protected with a synchronized block on the instance. Therefore, with respect to the newStatusLink() method, making the whole start() method synchronized is not justified. Too much synchronization is just as bad as too little. The latter exposes your data to concurrency hazards and corruption, while the former makes your code slow and prone to blocking and deadlocks.

#13

Serializing the CommandRunner will just make command startup a heavy operation.
Please favor simplicity and small code.
There should be a better solution for this problem which doesn't hide the isolate's main class either.

Do you have any suggestions as to how to do this?

Personally, I'm inclined to stick with the current approach (i.e. passing the CommandRunner by reference) until we come up with a better alternative.

-----------------

Re the issue of synchronization. Too much synchronization is not as bad as too little. Too much synchronization leads to excessive lock contention; i.e. the code is slow, but at least it works. Deadlocks (in Java) are caused by different threads trying to acquire the same set of locks in different orders. This is orthogonal to whether you are doing too much (or little) locking.
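The lock-ordering point can be illustrated with a small sketch in which two locks are always acquired in a fixed order, whatever order the caller names them in. RankedLock and OrderedLocks are hypothetical helpers, and the sketch assumes that every lock has a unique rank.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper: a lock with a rank that defines the global acquisition order.
// Assumption: ranks are unique across all locks that may be taken together.
final class RankedLock {
    final int rank;
    final ReentrantLock lock = new ReentrantLock();

    RankedLock(int rank) {
        this.rank = rank;
    }
}

final class OrderedLocks {
    /** Runs the action holding both locks, always acquiring the lower-ranked lock first. */
    static void withBoth(RankedLock a, RankedLock b, Runnable action) {
        RankedLock first = a.rank <= b.rank ? a : b;
        RankedLock second = first == a ? b : a;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                action.run();
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }
}
```

Two threads calling withBoth(a, b, ...) and withBoth(b, a, ...) both take the lower-ranked lock first, so neither can end up holding one lock while waiting for the other.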

But even if too much locking caused deadlock, deadlocks are far easier to diagnose than race conditions caused by too little locking. With a deadlock, part of the system will freeze, and a thread state dump will clearly tell you what threads and locks are involved. With a race condition, the typical symptom is that some state gets corrupted with little if any evidence as to how this happened. In my experience, race conditions are notoriously hard to diagnose.

IMO, our best strategy is to be conservative with locking while we are developing JNode. Later on we can revisit this, removing unnecessary locking, changing lock granularity and so on, to deal with performance issues revealed by profiling, etc. IMO, focusing on minimizing locking now is premature optimization. And even assuming we get it right in the first instance, changes in the code can invalidate the assumptions that caused us to not lock something ... resulting in new race conditions.

So, notwithstanding that you may be right that the extra "synchronized" is unnecessary, I think we should leave it there for now.

#14

Assigned to:Stephen Crawley» Anonymous

Levente is now working on this issue.

#15

Status:active» fixed

The isolate invoker is (kind of) working, though aspects of the implementation need some more work. For example, the shell error stream is not handled properly, and the mechanisms for passing command line state to the new isolate are a hack, and potentially problematic.

I'm calling this issue fixed.

#16

Status:fixed» closed

Automatically closed -- issue fixed for two weeks with no activity.

#17

Status:closed» active

Handling of the error stream is important for debugging the code running in separate isolates.
Is it possible to continue the work on this?

#18

I am really loath to get involved with isolate related stuff any more. Sorry.

#19

Status:active» postponed

I'm working on this again, and I've created a new tracking issue. Marking this one as fixed, in favour of the new one.
We actually do have a mostly working isolate invoker now.

#20

Status:postponed» fixed

#21

Status:fixed» closed

Manually closed.