Lisp HUG Maillist Archive

Tail Call Optimization

I know that LW can optimize tail calls in many cases. I have seen it many times. But being able to predict when it can occur is next to impossible.

I have instances where the tail call is through a captured closure stored in a local binding, or in one of the bindings of the closure of the current function. And in those cases it seems that LW refuses to perform a tail call through that pointer, no matter how I declare the optimization to the compiler. There is only a single value that I’m passing to the captured closure, so it isn’t a question of too many args, or keyword args, etc., interfering.

So it seems that compiler declarations are a weak method of invoking what you want, when you absolutely know you should be getting tail calls. Why not implement a Lisp primitive that directly performs a tail call, i.e., JMP with args?

(Stupid) Example that blows the stack:

(let ((sav-k #'values)
      (sav-n 500))
  (block myblock
    (tagbody top
             (let ((k sav-k)
                   (n sav-n))
               (return-from myblock
                 (if (< n 2)
                     (funcall k 1)
                   (let ((k (let ((k k)  ;; form our own captures just to be certain...
                                  (n n)) ;; should not be necessary here
                              (lambda (t1)
                                (funcall k (* t1 n))))))
                     (setf sav-k k
                           sav-n (1- n))
                     (go top))
                   ))))))

- David McClain
Refined Audiometrics Laboratory
refined-audiometrics.com
Tucson, AZ, USA

Re: Tail Call Optimization

Here is what I managed to come up with. It no longer blows the stack. But the runtime is likely suffering a bit, compared to what a direct tail-call could provide...

(define-condition values-signal ()
  ((args  :reader values-signal-args :initarg :args)))

(defun retk (&rest args)
  (error (make-condition 'values-signal :args args)))

(defun fwdk (c)
  (apply 'funcall (values-signal-args c)))

(let ((sav-k #'values)
      (sav-n 5000))
  (block myblock
    (tagbody top
             (let ((k sav-k)
                   (n sav-n))
                   (return-from myblock
                     (handler-case
                         (if (< n 2)
                             (retk k 1) ;; instead of funcall
                           (let ((k (lambda (t1)
                                      (retk k (* n t1)))))
                             (setf sav-k k
                                   sav-n (1- n))
                             (go top)))
                       (values-signal (c)
                         (block hndlr-block
                           (tagbody again
                                    (handler-case
                                        (return-from hndlr-block
                                          (fwdk c))
                                      (values-signal (cnew)
                                        (setf c cnew)
                                        (go again))))))
                       ))))))

- David McClain
Refined Audiometrics Laboratory
refined-audiometrics.com
Tucson, AZ, USA

On Oct 3, 2020, at 11:43 AM, dbm@refined-audiometrics.com wrote:

I know that LW can optimize tail calls in many cases. I have seen it many times. But being able to predict when it can occur is next to impossible.

I have instances where the tail call is through a captured closure stored in a local binding, or in one of the bindings of the closure of the current function. And in those cases it seems that LW refuses to perform a tail call through that pointer, no matter how I declare the optimization to the compiler. There is only a single value that I’m passing to the captured closure, so it isn’t a question of too many args, or keyword args, etc., interfering.

So it seems that compiler declarations are a weak method of invoking what you want, when you absolutely know you should be getting tail calls. Why not implement a Lisp primitive that directly performs a tail call, i.e., JMP with args?

(Stupid) Example that blows the stack:

(let ((sav-k #'values)
      (sav-n 500))
  (block myblock
    (tagbody top
             (let ((k sav-k)
                   (n sav-n))
               (return-from myblock
                 (if (< n 2)
                     (funcall k 1)
                   (let ((k (let ((k k)  ;; form our own captures just to be certain...
                                  (n n)) ;; should not be necessary here
                              (lambda (t1)
                                (funcall k (* t1 n))))))
                     (setf sav-k k
                           sav-n (1- n))
                     (go top))
                   ))))))

- David McClain
Refined Audiometrics Laboratory
refined-audiometrics.com
Tucson, AZ, USA

Re: Tail Call Optimization

A long time ago, I wrote a version of named let fir CL.  It had to run on Genera, which did not optimise tail calls, at all.  So on genera it turned the tail call into a GO.  And I thought I was very clever.  But now it was this stupid fragile thing since it also turned non-tail calls into GOs of course, and broke everything.

I never used my clever hack after the first time it blew up in my face.

The solution to an implementation not optimising tail calls when you want it to is one of two things, I think:

- rewrite the loops as loops, perhaps with some macro which helps you express the loops you want better;
- and/or persuade the people who look after the implementation that this is a case where it should be optimising the tail call away.

--tim

On 4 Oct 2020, at 02:21, dbm@refined-audiometrics.com wrote:

Here is what I managed to come up with. It no longer blows the stack. But the runtime is likely suffering a bit, compared to what a direct tail-call could provide...

(define-condition values-signal ()
  ((args  :reader values-signal-args :initarg :args)))

(defun retk (&rest args)
  (error (make-condition 'values-signal :args args)))

(defun fwdk (c)
  (apply 'funcall (values-signal-args c)))

(let ((sav-k #'values)
      (sav-n 5000))
  (block myblock
    (tagbody top
             (let ((k sav-k)
                   (n sav-n))
                   (return-from myblock
                     (handler-case
                         (if (< n 2)
                             (retk k 1) ;; instead of funcall
                           (let ((k (lambda (t1)
                                      (retk k (* n t1)))))
                             (setf sav-k k
                                   sav-n (1- n))
                             (go top)))
                       (values-signal (c)
                         (block hndlr-block
                           (tagbody again
                                    (handler-case
                                        (return-from hndlr-block
                                          (fwdk c))
                                      (values-signal (cnew)
                                        (setf c cnew)
                                        (go again))))))
                       ))))))

- David McClain
Refined Audiometrics Laboratory
refined-audiometrics.com
Tucson, AZ, USA

On Oct 3, 2020, at 11:43 AM, dbm@refined-audiometrics.com wrote:

I know that LW can optimize tail calls in many cases. I have seen it many times. But being able to predict when it can occur is next to impossible.

I have instances where the tail call is through a captured closure stored in a local binding, or in one of the bindings of the closure of the current function. And in those cases it seems that LW refuses to perform a tail call through that pointer, no matter how I declare the optimization to the compiler. There is only a single value that I’m passing to the captured closure, so it isn’t a question of too many args, or keyword args, etc., interfering.

So it seems that compiler declarations are a weak method of invoking what you want, when you absolutely know you should be getting tail calls. Why not implement a Lisp primitive that directly performs a tail call, i.e., JMP with args?

(Stupid) Example that blows the stack:

(let ((sav-k #'values)
      (sav-n 500))
  (block myblock
    (tagbody top
             (let ((k sav-k)
                   (n sav-n))
               (return-from myblock
                 (if (< n 2)
                     (funcall k 1)
                   (let ((k (let ((k k)  ;; form our own captures just to be certain...
                                  (n n)) ;; should not be necessary here
                              (lambda (t1)
                                (funcall k (* t1 n))))))
                     (setf sav-k k
                           sav-n (1- n))
                     (go top))
                   ))))))

- David McClain
Refined Audiometrics Laboratory
refined-audiometrics.com
Tucson, AZ, USA

Re: Tail Call Optimization

The documentation is quite vague. It turns out, after some experiments on this end, that the presence of either &KEY or &REST args destroys the ability to provide TCO. The use of &OPTIONAL seem to be okay, including those with init forms, and there seems no practical limit on them, nor on the number of simple args, for X86.

The use of GO is not quite appropriate in many cases, since the continuation closures are often coming from unknown and arbitrary places. A GO requires a known tag.

It turns out that the restriction to avoid &REST is qutie severe in many cases, as it prevents the easy construction of more general purpose macros. I suppose something could be cobbled together using a long list of &OPTIONAL args.

- DM


_______________________________________________
Lisp Hug - the mailing list for LispWorks users
lisp-hug@lispworks.com
http://www.lispworks.com/support/lisp-hug.html

Re: Tail Call Optimization

It turns out that, while a “proper implementation” would have the continuation calls be terminal, in Lisp they aren’t and you can have activity following that continuation call.

Lisp shares some programming traits in common with my old Forth languages - in that as long as you write things properly, then things will work. Lisp also leaves things out in the open for access by “constrained adults”, whereas other languages seek to hide through encapsulation. On the whole, I think I prefer the Lisp way.



> On Oct 4, 2020, at 2:18 AM, dbm@refined-audiometrics.com wrote:
> 
> The documentation is quite vague. It turns out, after some experiments on this end, that the presence of either &KEY or &REST args destroys the ability to provide TCO. The use of &OPTIONAL seem to be okay, including those with init forms, and there seems no practical limit on them, nor on the number of simple args, for X86.
> 
> The use of GO is not quite appropriate in many cases, since the continuation closures are often coming from unknown and arbitrary places. A GO requires a known tag.
> 
> It turns out that the restriction to avoid &REST is qutie severe in many cases, as it prevents the easy construction of more general purpose macros. I suppose something could be cobbled together using a long list of &OPTIONAL args.
> 
> - DM
> 
> 
> _______________________________________________
> Lisp Hug - the mailing list for LispWorks users
> lisp-hug@lispworks.com
> http://www.lispworks.com/support/lisp-hug.html


_______________________________________________
Lisp Hug - the mailing list for LispWorks users
lisp-hug@lispworks.com
http://www.lispworks.com/support/lisp-hug.html

Re: Tail Call Optimization

So, in summary,

1. The original code actually runs fine once compiled. 

2. No need for the HANDLER-CASE to consolidate continuation calls at one stack level (you can do, but then it causes problems if the continuation is picked up in another thread from where it was constructed)

3. TCO works just fine, nice and speedy, compared to HANDLER-CASE, so long as you avoid continuation closures that accept &KEY or &REST args.

I thought I was faithfully showing my complaint case in the original code, but alas my macros inserted &REST args that I did not faithfully carry over to the stupid example code. However, in contrast to the documentation, the stupid example contains nothing with dynamic scope, and so the documentation would have us believe that TCO will work there. It does, in the code that I showed, but it didn’t in my macrology becauase of the &REST args.

It would still be nice to have a forced, terminating, GoTo indirect, with args… I can do that in Assembly language, so why should Lisp be hobbled?

- DM

> On Oct 4, 2020, at 2:23 AM, dbm@refined-audiometrics.com wrote:
> 
> It turns out that, while a “proper implementation” would have the continuation calls be terminal, in Lisp they aren’t and you can have activity following that continuation call.
> 
> Lisp shares some programming traits in common with my old Forth languages - in that as long as you write things properly, then things will work. Lisp also leaves things out in the open for access by “constrained adults”, whereas other languages seek to hide through encapsulation. On the whole, I think I prefer the Lisp way.
> 
> 
> 
>> On Oct 4, 2020, at 2:18 AM, dbm@refined-audiometrics.com wrote:
>> 
>> The documentation is quite vague. It turns out, after some experiments on this end, that the presence of either &KEY or &REST args destroys the ability to provide TCO. The use of &OPTIONAL seem to be okay, including those with init forms, and there seems no practical limit on them, nor on the number of simple args, for X86.
>> 
>> The use of GO is not quite appropriate in many cases, since the continuation closures are often coming from unknown and arbitrary places. A GO requires a known tag.
>> 
>> It turns out that the restriction to avoid &REST is qutie severe in many cases, as it prevents the easy construction of more general purpose macros. I suppose something could be cobbled together using a long list of &OPTIONAL args.
>> 
>> - DM
>> 
>> 
>> _______________________________________________
>> Lisp Hug - the mailing list for LispWorks users
>> lisp-hug@lispworks.com
>> http://www.lispworks.com/support/lisp-hug.html
> 
> 
> _______________________________________________
> Lisp Hug - the mailing list for LispWorks users
> lisp-hug@lispworks.com
> http://www.lispworks.com/support/lisp-hug.html


_______________________________________________
Lisp Hug - the mailing list for LispWorks users
lisp-hug@lispworks.com
http://www.lispworks.com/support/lisp-hug.html

Re: Tail Call Optimization

As for the Named Let example code, that NLET-TAIL is to be used only when you now you will recurse from tail position. 

And the whole point of my exercise was to take something with non-tail recursion and force it into tail recursion by using continuations. It works. For non-tail recursion I have NLET, which also happens to perform TCO when used in tail-recursive situations, so long as you avoid &KEY and &REST args in the function. 

But NLET will not turn anything into a GO, while NLET-TAIL always does.

There is no need to be ruled by the LOOP…

- DM



_______________________________________________
Lisp Hug - the mailing list for LispWorks users
lisp-hug@lispworks.com
http://www.lispworks.com/support/lisp-hug.html

Re: Tail Call Optimization

… and with suitable macrology, this condenses nicely into:

(=nlet-tail fact ((n 500))
  (if (< n 2)
      (=values 1)
    (=bind (t1)
        (fact (1- n))
      (=values (* n t1)))))

- David McClain
Refined Audiometrics Laboratory
refined-audiometrics.com
Tucson, AZ, USA


On Oct 3, 2020, at 6:18 PM, dbm@refined-audiometrics.com wrote:

Here is what I managed to come up with. It no longer blows the stack. But the runtime is likely suffering a bit, compared to what a direct tail-call could provide...

(define-condition values-signal ()
  ((args  :reader values-signal-args :initarg :args)))

(defun retk (&rest args)
  (error (make-condition 'values-signal :args args)))

(defun fwdk (c)
  (apply 'funcall (values-signal-args c)))

(let ((sav-k #'values)
      (sav-n 5000))
  (block myblock
    (tagbody top
             (let ((k sav-k)
                   (n sav-n))
                   (return-from myblock
                     (handler-case
                         (if (< n 2)
                             (retk k 1) ;; instead of funcall
                           (let ((k (lambda (t1)
                                      (retk k (* n t1)))))
                             (setf sav-k k
                                   sav-n (1- n))
                             (go top)))
                       (values-signal (c)
                         (block hndlr-block
                           (tagbody again
                                    (handler-case
                                        (return-from hndlr-block
                                          (fwdk c))
                                      (values-signal (cnew)
                                        (setf c cnew)
                                        (go again))))))
                       ))))))

- David McClain
Refined Audiometrics Laboratory
refined-audiometrics.com
Tucson, AZ, USA

On Oct 3, 2020, at 11:43 AM, dbm@refined-audiometrics.com wrote:

I know that LW can optimize tail calls in many cases. I have seen it many times. But being able to predict when it can occur is next to impossible.

I have instances where the tail call is through a captured closure stored in a local binding, or in one of the bindings of the closure of the current function. And in those cases it seems that LW refuses to perform a tail call through that pointer, no matter how I declare the optimization to the compiler. There is only a single value that I’m passing to the captured closure, so it isn’t a question of too many args, or keyword args, etc., interfering.

So it seems that compiler declarations are a weak method of invoking what you want, when you absolutely know you should be getting tail calls. Why not implement a Lisp primitive that directly performs a tail call, i.e., JMP with args?

(Stupid) Example that blows the stack:

(let ((sav-k #'values)
      (sav-n 500))
  (block myblock
    (tagbody top
             (let ((k sav-k)
                   (n sav-n))
               (return-from myblock
                 (if (< n 2)
                     (funcall k 1)
                   (let ((k (let ((k k)  ;; form our own captures just to be certain...
                                  (n n)) ;; should not be necessary here
                              (lambda (t1)
                                (funcall k (* t1 n))))))
                     (setf sav-k k
                           sav-n (1- n))
                     (go top))
                   ))))))

- David McClain
Refined Audiometrics Laboratory
refined-audiometrics.com
Tucson, AZ, USA


Re: Tail Call Optimization

I FOUND IT! 

As long as you call a Lambda function with simple arguments as the captured continuation, the tail forwarding calls work with TCO. My original macros were expanding the arguments lists so that a function like this:

(lambda (x) ...)

were actually expanded to this:

(lambda (&optional x &rest ignored) ... )

And that seemingly simple alteration is all that was needed to prevent the TCO inside the Lambda closure.

- DM

On Oct 3, 2020, at 6:22 PM, dbm@refined-audiometrics.com wrote:

… and with suitable macrology, this condenses nicely into:

(=nlet-tail fact ((n 500))
  (if (< n 2)
      (=values 1)
    (=bind (t1)
        (fact (1- n))
      (=values (* n t1)))))

- David McClain
Refined Audiometrics Laboratory
refined-audiometrics.com
Tucson, AZ, USA


On Oct 3, 2020, at 6:18 PM, dbm@refined-audiometrics.com wrote:

Here is what I managed to come up with. It no longer blows the stack. But the runtime is likely suffering a bit, compared to what a direct tail-call could provide...

(define-condition values-signal ()
  ((args  :reader values-signal-args :initarg :args)))

(defun retk (&rest args)
  (error (make-condition 'values-signal :args args)))

(defun fwdk (c)
  (apply 'funcall (values-signal-args c)))

(let ((sav-k #'values)
      (sav-n 5000))
  (block myblock
    (tagbody top
             (let ((k sav-k)
                   (n sav-n))
                   (return-from myblock
                     (handler-case
                         (if (< n 2)
                             (retk k 1) ;; instead of funcall
                           (let ((k (lambda (t1)
                                      (retk k (* n t1)))))
                             (setf sav-k k
                                   sav-n (1- n))
                             (go top)))
                       (values-signal (c)
                         (block hndlr-block
                           (tagbody again
                                    (handler-case
                                        (return-from hndlr-block
                                          (fwdk c))
                                      (values-signal (cnew)
                                        (setf c cnew)
                                        (go again))))))
                       ))))))

- David McClain
Refined Audiometrics Laboratory
refined-audiometrics.com
Tucson, AZ, USA

On Oct 3, 2020, at 11:43 AM, dbm@refined-audiometrics.com wrote:

I know that LW can optimize tail calls in many cases. I have seen it many times. But being able to predict when it can occur is next to impossible.

I have instances where the tail call is through a captured closure stored in a local binding, or in one of the bindings of the closure of the current function. And in those cases it seems that LW refuses to perform a tail call through that pointer, no matter how I declare the optimization to the compiler. There is only a single value that I’m passing to the captured closure, so it isn’t a question of too many args, or keyword args, etc., interfering.

So it seems that compiler declarations are a weak method of invoking what you want, when you absolutely know you should be getting tail calls. Why not implement a Lisp primitive that directly performs a tail call, i.e., JMP with args?

(Stupid) Example that blows the stack:

(let ((sav-k #'values)
      (sav-n 500))
  (block myblock
    (tagbody top
             (let ((k sav-k)
                   (n sav-n))
               (return-from myblock
                 (if (< n 2)
                     (funcall k 1)
                   (let ((k (let ((k k)  ;; form our own captures just to be certain...
                                  (n n)) ;; should not be necessary here
                              (lambda (t1)
                                (funcall k (* t1 n))))))
                     (setf sav-k k
                           sav-n (1- n))
                     (go top))
                   ))))))

- David McClain
Refined Audiometrics Laboratory
refined-audiometrics.com
Tucson, AZ, USA



Re: Tail Call Optimization

http://www.lispworks.com/documentation/lw71/LW/html/lw-66.htm#pgfId-887598

the documentation says:


In 64-bit LispWorks and on x86 platforms the compiler optimizes tail calls unless

  1. The compiler optimize quality debug is 3, or
  2. There is something with dynamic scope on the stack, such as a special binding, a catch or cl:dynamic-extent allocation (so it is not really a tail call)
  3. On all other platforms the compiler optimizes tail calls unless 1.) or 2.) above apply, or

  4. The call has more than 4 arguments and this is more than the number of fixed (not &optional/&rest/&key) parameters in the calling function.
  5. The call has more than 4 arguments and the calling function has &rest/&key parameters.

Am 04.10.2020 um 08:40 schrieb dbm@refined-audiometrics.com:

I FOUND IT! 

As long as you call a Lambda function with simple arguments as the captured continuation, the tail forwarding calls work with TCO. My original macros were expanding the arguments lists so that a function like this:

(lambda (x) ....)

were actually expanded to this:

(lambda (&optional x &rest ignored) ... )

And that seemingly simple alteration is all that was needed to prevent the TCO inside the Lambda closure.

- DM

On Oct 3, 2020, at 6:22 PM, dbm@refined-audiometrics.com wrote:

… and with suitable macrology, this condenses nicely into:

(=nlet-tail fact ((n 500))
  (if (< n 2)
      (=values 1)
    (=bind (t1)
        (fact (1- n))
      (=values (* n t1)))))

- David McClain
Refined Audiometrics Laboratory
refined-audiometrics.com
Tucson, AZ, USA


On Oct 3, 2020, at 6:18 PM, dbm@refined-audiometrics.com wrote:

Here is what I managed to come up with. It no longer blows the stack. But the runtime is likely suffering a bit, compared to what a direct tail-call could provide...

(define-condition values-signal ()
  ((args  :reader values-signal-args :initarg :args)))

(defun retk (&rest args)
  (error (make-condition 'values-signal :args args)))

(defun fwdk (c)
  (apply 'funcall (values-signal-args c)))

(let ((sav-k #'values)
      (sav-n 5000))
  (block myblock
    (tagbody top
             (let ((k sav-k)
                   (n sav-n))
                   (return-from myblock
                     (handler-case
                         (if (< n 2)
                             (retk k 1) ;; instead of funcall
                           (let ((k (lambda (t1)
                                      (retk k (* n t1)))))
                             (setf sav-k k
                                   sav-n (1- n))
                             (go top)))
                       (values-signal (c)
                         (block hndlr-block
                           (tagbody again
                                    (handler-case
                                        (return-from hndlr-block
                                          (fwdk c))
                                      (values-signal (cnew)
                                        (setf c cnew)
                                        (go again))))))
                       ))))))

- David McClain
Refined Audiometrics Laboratory
refined-audiometrics.com
Tucson, AZ, USA

On Oct 3, 2020, at 11:43 AM, dbm@refined-audiometrics.com wrote:

I know that LW can optimize tail calls in many cases. I have seen it many times. But being able to predict when it can occur is next to impossible.

I have instances where the tail call is through a captured closure stored in a local binding, or in one of the bindings of the closure of the current function. And in those cases it seems that LW refuses to perform a tail call through that pointer, no matter how I declare the optimization to the compiler. There is only a single value that I’m passing to the captured closure, so it isn’t a question of too many args, or keyword args, etc., interfering.

So it seems that compiler declarations are a weak method of invoking what you want, when you absolutely know you should be getting tail calls. Why not implement a Lisp primitive that directly performs a tail call, i.e., JMP with args?

(Stupid) Example that blows the stack:

(let ((sav-k #'values)
      (sav-n 500))
  (block myblock
    (tagbody top
             (let ((k sav-k)
                   (n sav-n))
               (return-from myblock
                 (if (< n 2)
                     (funcall k 1)
                   (let ((k (let ((k k)  ;; form our own captures just to be certain...
                                  (n n)) ;; should not be necessary here
                              (lambda (t1)
                                (funcall k (* t1 n))))))
                     (setf sav-k k
                           sav-n (1- n))
                     (go top))
                   ))))))

- David McClain
Refined Audiometrics Laboratory
refined-audiometrics.com
Tucson, AZ, USA




Updated at: 2020-12-10 08:28 UTC